NASA-CR-199103 


Applied Information Systems Research Program 


(AISRP) 


Workshop III 




Meeting Proceedings 


Laboratory for Atmospheric and Space Physics 
University of Colorado 
Boulder, Colorado 


August 3-6, 1993 




Proceedings Issued By: 


Information Systems Branch 
Office of Space Science 
NASA Headquarters 



APPLIED INFORMATION SYSTEMS RESEARCH PROGRAM 
WORKSHOP III FINAL REPORT
TABLE OF CONTENTS 


Executive Summary 
Opening Remarks [Bredekamp] 

An Overview of the Software Support Laboratory [Davis] 

ARTIFICIAL INTELLIGENCE TECHNIQUES SESSION 

Holographic Neural Networks for Space and Commercial Applications [Liu] 

The SEIDAM (System of Experts for Intelligent Data Management)
Project [Goodenough]

SIGMA: An Intelligent Model-Building Assistant [Keller] 

Multivariate Statistical Analysis Software Technologies for Astrophysical 
Research Involving Large Databases [Djorgovski] 

SCIENTIFIC VISUALIZATION SESSION 

The Grid Analysis and Display System (GrADS): A Practical Tool for 
Earth Science Visualization [Kinter] 

A Distributed Analysis and Visualization System for Model and 
Observational Data [Wilhelmson] 

Handling Intellectual Property at UCAR/NCAR [Buzbee] 

NCAR Graphics - What’s New in 3.2? [Lackman] 

McIDAS-eXplorer: A Version of McIDAS-X for Planetary Applications [Limaye] 
Experimenter’s Laboratory for Visualized Interactive Science (ELVIS) [Hansen] 
SAVS: A Space and Atmospheric Visualization Science System [Szuszczewicz] 



TABLE OF CONTENTS (continued) 


LinkWinds - The Linked Windows Interactive Data System [Walton] 

DataHub [Handley] 

Data Visualization and Sensor Fusion [McCullough] 

DATA COMPRESSION/ARCHIVING/ACCESS/ANALYSIS SESSION
Data Compression [Storer] 

SAMS: A Spatial Analysis and Modeling System for Environmental 
Monitoring [Stetina] 

Land Surface Testbed for EOSDIS [Kelly] 

Development of a Tool-Set for Simultaneous, Multi-Site Observations 
of Astronomical Objects [Chakrabarti] 

Geographic Information System for Fusion and Analysis of High Resolution 
Remote Sensing and Ground Truth Data - Progress Report [Freeman] 

Envision: A System for Management and Display of Large Data Sets [Bowman] 

RESEARCH AND TECHNOLOGY SESSION 
Research and Technology Activities at JPL [Walton] 

Navigation Ancillary Information Facility - An Overview of SPICE [Walton] 

Using the NAIF SPICE Kernel Concepts and the NAIF Toolkit Software for
Geometry Parameter Generation and Observation Visualization [Simmons]

EOS/Pathfinder Interuse Experiment [Botts] 

Overview of Ames Research Center Advanced Network Applications [Yin] 


Summary and Action Items [Mucklow] 



APPENDICES 


A. AISRP Agenda 

B. Participants and Attendees 

C. Abstracts 

D. Demonstrations 


E. Presentation Material 



Right: 

Workstations provided attendees with 
first-hand demonstrations of the 
results of research projects. 


Below: 

Through demonstrations, new 
visualization techniques were explained 
to Jim Dodge and other workshop 
participants. 





AISRP Workshop III 


August 3-6, 1993 


EXECUTIVE SUMMARY 


The third Workshop of the Applied Information Systems Research Program (AISRP) was again
hosted by the University of Colorado’s Laboratory for Atmospheric and Space Physics (LASP)
in Boulder, Colorado, August 3-6, 1993. The Workshop was sponsored and chaired by Glenn
Mucklow of the Information Systems Branch (ISB) of NASA’s Office of Space Science. The
focus of this year’s Workshop was "Technology Transfer," with emphasis on the "applied" aspect
of the Applied Information Systems Research Program. The AISRP investigators presented their
progress to date, gave demonstrations of their tools and software, and addressed the technology
transfer activities that have occurred and are planned or anticipated.

The presentations were organized into four sessions: Artificial Intelligence Techniques, chaired 
by Dr. Richard Keller from Ames Research Center; Scientific Visualization, chaired by Dr. Theo
Pavlidis from State University of New York; Data Management and Archiving, chaired by Dr.
James Storer from Brandeis University; and Research and Technology, chaired by Dr. Amy
Walton from the Jet Propulsion Laboratory. Mr. Joseph Bredekamp, Head of the ISB, opened
the Workshop with some remarks on the recent NASA reorganization and new directions for the
agency.

Mr. Bredekamp encouraged the group to look at the topic of technology transfer, and discuss 
some lessons learned on teaming and partnerships. One of the evaluation criteria in future NASA 
Announcements of Opportunity will be technology transfer. In the past, there has been policy
conflict surrounding this issue: the administration wants to promote U.S. competitiveness, and
the science focus is on cooperative agreements and collaboration. The program is working on
ways to promote both.

Before the sessions started, Dr. Randy Davis, our host from LASP, described the Software
Support Laboratory (SSL), established as a result of last year’s Workshop, which provides 
researchers with a "one-stop" location to learn about the tools that are available and how they 
can be accessed or obtained. It also hopes to be a conduit of information to software developers 
as well as provide the science community with products and trends. 

During the general discussion sessions, several topics of continuing interest to the participants 
were raised: issues related to technology transfer, including how to better get the new tools out 
into the research community; issues related to hardware portability and design of tools and
products; meta data and data description, including a standard framework in which to provide 
data information; issues related to archiving data, including the role of government and the 
private sector, as well as market needs as drivers; and network accessibility for researchers.

[...] the AISRP, will help to meet this challenge.


Glenn H. Mucklow 
Program Manager 

Information Systems Research and Technology


Tuesday, August 3, 1993


Mr. Glenn Mucklow, the Workshop Chairman from NASA’s Office of Space Science, welcomed 
the participants and thanked the Workshop host, Dr. Randal Davis and the Laboratory for
Atmospheric and Space Physics (LASP) at the University of Colorado. The theme for this year’s 
Workshop was "Technology Transfer," and focussed on the "applied" aspect of the Applied 
Information Systems Research Program (AISRP). 


Opening Remarks 

Mr. Joe Bredekamp 

Information Systems Branch (ISB) 

NASA Headquarters Office of Space Science 

The AISRP’s progress has been very encouraging, and it is developing products and starting to
see some payback. In addition, the program is having a broadening influence within NASA and 
in the science community. 

The new NASA Administrator is an agent of change, reflecting the direction of the new 
administration. A major message that is consciously being conveyed both internally and 
externally is that NASA must change its way of doing business. There have been major 
reorganizations within the agency. The Office of Space Science and Applications (OSSA) has 
been reorganized into three science program offices: Mission To Planet Earth (MTPE), the
Office of Space Science (OSS), and the Office of Life and Microgravity Sciences and
Applications (OLMSA). The Information Systems Branch (ISB), which supports all three science 
program offices and resides organizationally in the Office of Space Science, was kept as whole, 
integrated, and coherent as possible. The ISB has been going through a very positive transition 
over the past several months, but it has become clear that no assumptions can be made about 
what is securely in the program, and the organization will be looking carefully at everything. 
The administration is focussing on a different strategy: exploiting technology for broader
applications. The ISB has been developing an integrated technology strategy that is aimed at 
developing, utilizing, and transferring technology to improve mission effectiveness and infusing 
this technology into the mainstream of the commercial marketplace. Information technology has 
an opportunity to make a significant difference in science operations and mission operations to 
reduce costs and increase the science return.

OSS’s Process Assessment Team (PAT), chaired by Mary Kicza, is currently working in
partnership with the Office of Advanced Concepts and Technology (OACT) to come up with a
plan to identify technologies, infuse them into missions, and transfer them to the marketplace.
The organization is also working with the Office of Space Communications to exploit ground 
operations, and will be doing flight/ground trade studies to try to reduce costs and improve 
effectiveness. Mr. Bredekamp indicated that the final draft of the Integrated Technology Strategy 
is in preparation, and that copies would be distributed to the AISRP group. 

Mr. Bredekamp encouraged the group to look at the topic of technology transfer, and discuss
some lessons learned on teaming and partnerships. One of the evaluation criteria in future
Announcements of Opportunity (AOs) will be technology transfer. In the past, there has been
policy conflict surrounding this issue: the
administration wants to promote U.S. competitiveness, and the science focus is on cooperative 
agreements and collaboration. The program is working on ways to promote both. 


An Overview of the Software Support Laboratory

Dr. Randal Davis 

LASP, University of Colorado 


One outcome from last year’s Workshop was the establishment of a Software Support Laboratory
at LASP to provide researchers a "one-stop" location to learn about the tools that are available
and how they can be accessed or obtained.


The SSL is both a research project and a working software repository and distribution center for 
the AISRP, and hopes to be a conduit of information to software developers as well as provide 
the science community with products and trends. 


Various levels of data management entities have grown up to provide better services for 
distributing data to the science community (through discipline data management units), and have 
begun to develop standards and practices within disciplines. The same type of thing needs to be 
done for software. Although the NASA archive, COSMIC, attempts to do this to some extent,
it does not have the resources to support software and keep it active and viable for the science
community.

After last year’s meeting, this issue was addressed, and three models were identified regarding 
software distribution: 1) another party maintains and distributes software products and
documentation; 2) the SSL maintains and distributes software and documentation (either via
minimal or full support); and 3) the party developing or maintaining the software is seamlessly 
linked with the SSL’s information and order services.

One of the common questions is what happens when industry steps in as a partner and wants to
modify the software. The issue arises when the modified product is of commercial value and
there is a desire to put software on ftp. It is unclear what role the SSL should have in
commercialization. At the present, the primary purpose of the SSL is to facilitate a smooth 
transition on developed products and remain useful to the science community. 


[...] information services for software developers and users: the World Wide
Web (WWW) and the Wide Area Information Servers (WAIS). Dr. Davis gave a demonstration
of the service. The SSL expects to continually distribute white papers and notes aimed
at both developers and users, and make documents available through the web.


A set of CD-ROM disks is being produced by the SSL and CSAT with sample Earth and space 
science data for use in testing and evaluating software. Most of NASA’s science disciplines will 
be represented, most common data object types and formats will be included, and all of the
data will have documentation and software for display. The SSL is considering producing
a CD-ROM sampler disk with software developed under the AISRP, other popular application 
programs, and libraries of routines to support access to popular data formats. 


Dr. Davis invited comments and suggestions regarding the initial version of the on-line SSL 
information service, as well as comments and suggestions on a software sampler CD-ROM. The
SSL needs information for the abstracts on the AISRP projects and software. 


The discussions focussed on the problems associated with software developed for different 
platforms, and what software can be supported in-house. Some of these problems may be 
conquered through use of a multi-platform software sampler. 


SESSION I - ARTIFICIAL INTELLIGENCE TECHNIQUES 
Chair: Dr. Richard Keller, Ames Research Center 

Holographic Neural Networks for Space and Commercial Applications

Dr. Hua-Kuang Liu 

Jet Propulsion Laboratory 

The holographic neural network provides a pattern recognition system that can greatly amplify 
the target signal while filtering out the background noise and clutter. Partial input of very dimly 
illuminated targets can be recognized, and there are a number of NASA and commercial 
applications for this technology. 

Dr. Liu discussed the approach and technical basis for the project, and provided a demonstration 
of the pattern recognition. Private industries have already expressed interest in the new
technology and end product of the project. Funding is provided by both NASA and ARPA. The
feasibility study has been completed this year, and the prototype will be ready in 1995. If
funding continues, this technology will be in users’ hands in 2000.


The SEIDAM (System of Experts for Intelligent Data Management) Project

Dr. David Goodenough 

Department of Natural Resources Canada 

Intelligent data management systems are required to support the increased complexity and
volume of remote sensing data that is generated. SEIDAM, a NASA project supported by the
Government of Canada and the Province of British Columbia, is a response to this need. The
system integrates remote sensing data from satellites and aircraft with geographic information
systems and manages large archives of remotely sensed data for query based recognition of forest
objects appropriate for environmental forest monitoring. SEIDAM learns from examples or
cases, consisting of a query, the data, and the desired goal. 

Dr. Goodenough provided examples of queries and how SEIDAM answers those queries. The
SEIDAM structure is a hierarchy of expert systems. Machine learning will be used to expedite
the creation of knowledge and new expert systems. To construct the SEIDAM system, the
project needed to accommodate different data sources: satellites, aircraft sensors, and field
measurements. Dr. Goodenough showed how learning reduced the number of new rules (learning
performance) for supervised classification and displayed slides of sensor data and visualization
examples. SEIDAM will connect to LandData
BC, the future provincial land information system. 

The first release of the Meta Data Model is complete, and the Meta Data Management subsystem 
is being developed now. In 1993 and 1994, satellite and aircraft acquisitions will take place over
three test sites, and SEIDAM products will be demonstrated for these selected test sites. 

After the presentation, the meeting participants discussed the SEIDAM level of accuracy, meta 
data, and utilization of commercial products. 


SIGMA: An Intelligent Model-Building Assistant

Dr. Richard Keller 

Ames Research Center (ARC) 


Over the past three years, the project goals have shifted. Initially, the project was to develop an
intelligent domain-specific tool to assist planetary scientists in building scientific models. The
current goal has expanded to include the development of an intelligent generic tool to assist
scientists in a variety of domains. SIGMA is a model building shell which consists of a GUI, a
scientific knowledge base, and knowledge acquisition tools. Scientists supply domain-specific 
knowledge and modeling scenario(s), and the product is a customized scientific modeling tool. 

Dr. Keller addressed some of the difficulties inherent in model-building, and how SIGMA 
provides new tools to produce the code and facilitate the model building. SIGMA has been used 
to reproduce two models reported in scientific literature: Voyager I data analysis and forestry
data. In addition, there have been numerous publications on the tool and a symposium. More
work needs to be done on knowledge acquisition and maintenance, and control constructs need 
to be incorporated. Also, SIGMA cannot yet handle coupled equations, and for large data sets, 
a compiler needs to be built. The project is currently in an alpha test by the collaborators, and 
participants have been identified for beta tests. The project is also looking at moving into a new 
modeling domain: life support and exobiology.

Technology transfer plans include: a graduate student seminar at the University of Montana; a 
modeling seminar at the upcoming Ecological Society of America conference; a Cassini modelers 
meeting; and commercialization opportunities through the NASA Technology Applications Team. 

In response to a question regarding how SIGMA compares with STELLA, Dr. Keller indicated 
that the types of models that can be built with STELLA are more restricted because knowledge 
of particular domains cannot be built into that system. 


Multivariate Statistical Analysis Software Technologies for Astrophysical Research Involving
Large Databases

Dr. George Djorgovski 

California Institute of Technology 

The project consists of the development of STATPROG, a user friendly, science-driven package 
for multivariate statistical analysis of relatively small data sets, and the development of SKICAT, 
an AI based system for processing and analysis of about 3 Tb of digital image information from 
the Second Palomar Sky Survey, and other present and future large astronomical data sets. 
SKICAT is a collaborative project between JPL and California Institute of Technology. 

SKICAT consists of catalog construction elements (AutoPlate and AutoCCD) and external 
catalogs (IRAS, ROSAT, etc.). Catalog management is currently in progress, and the scientific 
analysis elements are yet to be developed. Dr. Djorgovski discussed the science drivers that 
require a cataloging of galaxies and point sources. The results will serve to validate galaxy 
theories and also provide a quality control check on the Space Telescope Science Institute. The 
major parts of SKICAT have been generated and tested, and some general utilities have been
developed. The results of the initial scientific verification and testing will be of great interest
to astronomers. Future efforts include the completion of the catalog manipulation module and
exploration of the high level machine learning and machine discovery techniques. Long term,
the project aims to develop an evolving catalog through expert systems and machine learning
techniques.


The focus of the discussion period for Session I was technology transfer. One of the challenges 
is how to get information on developing technologies out to the science community. The
University of Maryland investigators were successful in using AGU this year for two half-day
sessions. When scientists start seeing published results using these tools, then the tools will
move into the community more extensively. In certain discipline areas, such as astronomy and
EOS, the old techniques will not work with the new orders of magnitude of the data sets, and
effective tools will be necessary. The tool developer needs to be part of the "instrument builder"
tradition. It is especially important that the scientists in the disciplines include attribution of tools
in their publications and proposals.


At the AGU, there was a lot of interest in the tools, and what worked very well was 
demonstration of the tools by scientists instead of vendors. Based upon a survey of AGU 
attendees, the format of both oral presentation and demonstration was good. 


Several ideas were suggested regarding ways to get published in mainline journals, including 
collecting several superior scientific papers and getting a special issue published. 

After the discussion session, demonstrations of the AISRP projects’ tools and software were
conducted. The items demonstrated are listed in Appendix D.


Wednesday, August 4, 1993


SESSION II - SCIENTIFIC VISUALIZATION
Chair: Dr. Theo Pavlidis

The Grid Analysis and Display System (GrADS): A Practical Tool for Earth Science
Visualization

Dr. James Kinter 

University of Maryland 

The GrADS was intended to be an Earth science tool, but is not strictly limited to weather 
graphics. The most important part of the GrADS design is integration: it integrates the
manipulation of expressions and functions, access to data, and display in the form of maps, 
charts, and animation. It is interactive, easy to learn and use, and produces hardcopy via vector 
graphics. 

Since last year, GrADS has been expanded in terms of data sets. It now supports GRIB and DRS 
as well as GrADS internal, and can support packed binary, ASCII, and netCDF. Currently, there 
are hundreds of GrADS users at over thirty institutions. It has been very successful for a number 
of reasons. It is easy to use and learn and it allows scientists to access and manipulate their data 
without additional programming. The software that is distributed is very simple for a scientist 
to install, and is platform independent: it runs on Unix workstations and PCs. GrADS has been
distributed free of charge, and the Center has been willing to work with user groups to add the 
functionality they request and eliminate user problems. GrADS is used in research, in education, 
in forecasting, and for public information. 

Dr. Kinter gave demonstrations of GrADS using others’ software for a variety of research and 
forecasting applications. After the demonstrations and a brief video, Dr. Kinter described the
major new features of GrADS, and what has been accomplished over the past year. A scripting
language added during the past year permits users to customize all aspects of GrADS, and allows
interaction with the display for interactive scientific investigation using one’s own data.


A Distributed Analysis and Visualization System for Model and Observational Data 
Dr. Robert Wilhelmson 

National Center for Supercomputing Applications (NCSA)

The project is part of a larger effort called PATHFINDER at NCSA. It is designed to bring to 
the researcher a flexible, modular, collaborative and distributed environment for use in studying
fluid flows, and one which can be tailored for specific scientific research and weather forecasting
needs. 

Since last year, a lot more work has been done with AVS. AVS is used to convert a subset of 
PATHFINDER modules on Convex and SGI machines. It is highly interactive, supports 
distributed computing, is extensible, provides user interface tools, and has a rich collection of 
existing modules. Other accomplishments over the past year include: the incorporation of
GEMVIS, a subset of PATHFINDER which is a set of capabilities targeted to the GEMPAK user
community; an animation tool; meta data; improved access to HDF; modification of some of the 
visual capabilities; laser disk recording; and prototyping and Explorer alpha and beta testing. 

In September, the beta release of NCSA PATHFINDER Explorer modules
will be made to the NCSA anonymous server, and GEMVIS will be released to Unidata. In
October, NCSA plans the beta release of the Inventor animation tool, and release of GEM
to COSMIC. Final release of the NCSA Pathfinder Explorer modules is planned for summer of
1994.

In response to a question regarding large data sets and the difficulties in handling them with SGI,
Dr. Wilhelmson agreed that there is need for a user tool that will do better memory management, 
which is very important for 3-D animation. 


Handling Intellectual Property at UCAR/NCAR 
Dr. Bill Buzbee 

National Center for Atmospheric Research 


There is a clear intellectual property issue whenever U.S. government funded work has gone 
abroad, and is used to develop products that are then sold back to the U.S. In addition, there is 
also an issue over how to take technology that has been developed with government funds and 
make it available to the private sector in a way that protects private investment and commercial 
viability. The UCAR/NCAR has developed an approach that addresses this issue. 

The University Corporation for Atmospheric Research (UCAR) has a committee on intellectual
property, and it works with the UCAR Foundation, a parallel organization that licenses
technology that has commercial value. Dr. Buzbee described the intellectual property process
for software. The developer discloses the intellectual property, and the UCAR Intellectual
Property Committee assesses the market potential. If the committee determines that no market
exists, the ownership is assigned to the developer or the public domain. If a market does exist,
UCAR proceeds to copyright or patent the property and the National Science Foundation (NSF)
obtains a nonexclusive and royalty-free license for government purposes. A dollar award is given
to the employee(s), and the UCAR Foundation seeks a licensee. The license may be exclusive
or non-exclusive, depending on circumstances. A user license includes maintenance and user
assistance via e-mail and phone, and the product is provided to universities at below cost. UCAR 
has found this to be an effective approach for handling technology transfer, and it provides 
protection to the value-added remarketer and for U.S. intellectual property. 
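
The process Dr. Buzbee described can be read as a simple decision flow. The sketch below is only an illustration of that flow as summarized in these proceedings; the names `Disclosure` and `process_disclosure` are hypothetical, and the committee's market assessment is treated as a given input rather than modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Disclosure:
    """A hypothetical record for one disclosed piece of software."""
    title: str
    market_exists: bool              # outcome of the committee's assessment (an input here)
    exclusive_license: bool = False  # "depending on circumstances" in the summary


def process_disclosure(d: Disclosure) -> List[str]:
    """Return the ordered steps the summary describes for one disclosure."""
    steps = [
        "developer discloses intellectual property",
        "committee assesses market potential",
    ]
    if not d.market_exists:
        # No market: ownership goes to the developer or the public domain.
        steps.append("ownership assigned to developer or public domain")
        return steps
    # A market exists: protect the property, then look for a licensee.
    steps.append("UCAR copyrights or patents the property")
    steps.append("NSF obtains nonexclusive, royalty-free government license")
    steps.append("dollar award given to employee(s)")
    kind = "exclusive" if d.exclusive_license else "non-exclusive"
    steps.append(f"UCAR Foundation seeks a licensee ({kind})")
    return steps


if __name__ == "__main__":
    for step in process_disclosure(Disclosure("example tool", market_exists=True)):
        print(step)
```

The single branch point is the committee's market determination; everything after it follows in a fixed order in the summary above.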


NCAR Graphics - What’s New in 3.2?
Dr. Bob Lackman 

National Center for Atmospheric Research 

Dr. Lackman discussed the latest features of NCAR Graphics. Functionality has been improved 
through C language binding, raster contouring, high quality fonts, new documentation, utility 
upgrades, an interactive display tool (idt), and CGM translators. NCAR has also added some 
unsupported directories (Explorer modules and high level utilities). Version 4.0 will be out in 
late 1994 and will include the high level utilities, the NCAR Command Language (NCL) interpreter,
NCL scripting language with loops and conditionals, a prototype GUI, and a user guide with 
tutorial and reference manual. 

There were a number of questions related to the intellectual property portion of the NCAR 
presentation. The criterion for exclusive or non-exclusive licenses is primarily hardware driven.
UCAR has occasionally gone outside for opinions, but typically the decision on whether property 
has market value is made by the Committee. If the product is not exclusively NCAR’s, then 
legal agreements must be made with other parties before going through the licensing process. 
It is NCAR’s policy that it will not include software unless these agreements are obtained. 
NCAR’s ratio of success for commercially viable transfer is better than 50%. 


McIDAS-eXplorer: A Version of McIDAS-X for Planetary Applications 
Dr. S. S. Limaye 
University of Wisconsin 

The objective of the program was to adapt the McIDAS environment for analysis of planetary 
data for use on small computers/workstations so that it is useful for research, operations, and 
education. One of the most demanding challenges at this point in time is a support mechanism 
for users. 

Currently, the project has a basic working core system on hand. No major problems have been 
encountered in the software development, with the exception of dynamic Magellan de-calibration. 
Most difficulties have been presented by the inability of different versions of Unix to read the 
PDS CD-ROM volumes. The incorporation of the NAIF/SPICE library in pre-existing software 
has progressed more slowly than anticipated, but no problems have been encountered.

The project plans to exhibit McIDAS-eXplorer at the annual meeting of the AAS/Division of
Planetary Sciences in October. Dr. Limaye’s group will experiment with some of the Mars 
Observer data when it is available for a trial run of use in a routine operational environment. 

Dr. Limaye gave a demonstration of the program from a workstation. He demonstrated
multi-frame display, query of data sets, animation, etc.


Experimenter’s Laboratory for Visualized Interactive Science (ELVIS)

Dr. Elaine Hansen 
University of Colorado 

ELVIS is the integration of a number of technologies, tools, and theories to display and 
manipulate data. A core component of the system is PolyPaint, which enables 3-D rendering. 
The project has applied advanced human factors techniques and theories, including user 
interviews, cognitive walk-throughs, controlled user tests, and prototype evaluations. PolyPaint 
has been upgraded to handle new 2-D objects, and TAE+ has been used and augmented. ELVIS 
has an extensible design which will support a scalable set of functions for shaded surface or
volumetric renderings in real or indexed color. 

Dr. Hansen demonstrated some of the features of ELVIS, including interactive user interface, 2-D
and 3-D renderings of some typical space science data sets, the direct manipulation 3-D view and 
lighting editor, the intuitive color editor, and the PolyPaint rendering package. 

Future plans include: integration with DataHub; integration of the spreadsheet engine; interface
to graphics hardware via Inventor; integration of visualization of 2-D graphics in 3-D space; an
alpha release in September; user interviews and user evaluations with their own data sets; and
beta release at the end of the project.

Technology transfer has already been occurring in the development phase and will continue 
through the third year of the program. Dr. Hansen discussed a number of ways in which
technologies have been or are being transferred-through transfer of advanced software tools and 
theories, transfer of intellectual resources, technology leveraging, evaluation testing, and 
technology insertion. 


The general discussion for this Session focussed on hardware portability. The hardware base of
the development community is not the same base as the science/university community. Is the
next generation user community going to have SGIs, or are they going to stay with
DOS/Windows systems? Some participants thought that tools and products should be developed
for the next generation of systems, while others thought that tools should be written for what the
community has or will have. There was also some discussion on the issue of efficient use of
resources. There are two modes of operation for most users: use of a high powered tool at a 
central location at the university, plus a variety of different types of hardware at the scientists’ 
desks. Products need to address both modes, or have software that is scalable. As networks 
improve, some of these issues will be resolved, and data compression techniques may also help 
address the issue. Most participants felt that visualization packages should not be dependent 
upon a particular platform, but networks should be used to access resources wherever they exist. 
In general, users need tools that are modular, easy to use, and extensible. 


SAVS: A Space and Atmospheric Visualization Science System

Dr. E. Szuszczewicz 

Science Applications International Corporation (SAIC) 

The three major contributors to SAVS are the Laboratory for Atmospheric and Space Science at 
SAIC, Advanced Visual Systems (AVS), and the University of Maryland. SAVS is focussed on 
the multi-disciplinary databases designed to understand the cause-effect relationships in the
solar-terrestrial system. It is composed of: widely accepted, commercially available visualization
software with a heavily leveraged international users’ module library; advanced distributed 
database techniques; and mathematical, analytical, and image processing tools. 

In addition to the solar-terrestrial applications, SAVS is extensible to the lower atmospheric,
Earth, and ocean sciences. SAVS is an integrated system that is easy to use, extensible, portable,
financially accessible, and uses an end-to-end approach. Dr. Szuszczewicz discussed the seven 
major tasks in the project, the accomplishments over the past year, and future projected activities. 


The SAVS demonstration focussed on the new, easier to use interactive front end. SAVS 
demonstrated an application using orbits, model, data, visualization, special algorithms, a 3-D to
1-D interpolation model, and remote data access. A special model has been developed to deal
with the 3-D to 1-D interpolation. Dr. Szuszczewicz also demonstrated the ability to do remote
data access in realtime through NSSDC to data files in the SAIC laboratory. Activities in the third
year will lead to full integration and tuning of all customized modules, and delivery of software 
and documentation, including the users’ manual. 
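
The 3-D to 1-D interpolation mentioned above can be illustrated with a minimal sketch. The summary does not describe the model SAVS actually uses, so the following assumes plain trilinear interpolation of a unit-spaced 3-D grid sampled along a list of orbit positions; the function names, the grid, and the track are all hypothetical.

```python
import math

def trilinear(grid, x, y, z):
    """Interpolate grid[i][j][k] (unit grid spacing) at fractional index (x, y, z)."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # Clamp the sample point so the 2x2x2 neighbourhood stays inside the grid.
    x = min(max(x, 0.0), nx - 1.0)
    y = min(max(y, 0.0), ny - 1.0)
    z = min(max(z, 0.0), nz - 1.0)
    i0 = min(int(math.floor(x)), nx - 2)
    j0 = min(int(math.floor(y)), ny - 2)
    k0 = min(int(math.floor(z)), nz - 2)
    fx, fy, fz = x - i0, y - j0, z - k0
    value = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                # Weight of each corner is the product of its 1-D linear weights.
                w = ((fx if di else 1.0 - fx)
                     * (fy if dj else 1.0 - fy)
                     * (fz if dk else 1.0 - fz))
                value += w * grid[i0 + di][j0 + dj][k0 + dk]
    return value

def sample_along_track(grid, track):
    """Reduce a 3-D field to a 1-D series of values along an orbit track."""
    return [trilinear(grid, x, y, z) for (x, y, z) in track]

# Hypothetical 3x3x3 model field; it is linear, so trilinear interpolation is exact.
grid = [[[i + 10 * j + 100 * k for k in range(3)] for j in range(3)] for i in range(3)]
# Hypothetical orbit positions expressed in fractional grid indices.
track = [(0.0, 0.0, 0.0), (0.5, 1.0, 1.5), (2.0, 2.0, 2.0)]
profile = sample_along_track(grid, track)
```

The resulting 1-D profile can then be compared point by point with measurements taken along the same orbit.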


LinkWinds - The Linked Windows Interactive Data System

Dr. Amy Walton 

Jet Propulsion Laboratory 


LinkWinds provides a suite of tools to interactively visualize, explore, and analyze large, 
multivariate and multidisciplinary data sets. It will support the rapid prototyping and execution 
of data analysis and visualization, and allow maximum data and tools accessibility with a
minimum of training.


During the past year, LinkWinds version 1.4 was released to a significantly expanded number
of scientific groups at a variety of institutions. New applications were added and existing
applications expanded. In addition, the project implemented a realtime data interface to
LinkWinds and used it to support the University of Iowa’s Plasma Wave Spectrometer aboard
the Galileo spacecraft.

To facilitate technology transfer, the project has made ingestion of data as easy as possible.
Direct LinkWinds interfaces have been provided for key datafile formats (HDF, CDF, and R[...]),
and arrangements have been made to provide DataHub. All new users will be provided with an
interactive realtime tutorial using MUSE.


Dr. Walton conducted a brief demonstration of LinkWinds. Currently, there are ten sites actively
using LinkWinds in research, and feedback from scientists has been very positive. In response
to a question, Dr. Walton indicated that LinkWinds uses straightforward grid data, and although
it handles 3-D data sets, it does not do data rendering. 


DataHub 

Dr. Tom Handley 

Jet Propulsion Laboratory 


DataHub is a value-added, knowledge-based server between the data suppliers and the data 
consumers. DataHub understands the different scientific data models, and addresses a variety of
needs. Dr. Handley described the functional architecture of DataHub, as well as the data model. 


Initially, the project attempted an artificial neural network, but is now investigating machine 
learning techniques for feature detection and learning. Although DataHub was developed with 
an emphasis on physical oceanography, it is now processing and extracting information from 
multi-spectral data. DataHub is being used (and delivered) with other applications, such as
LinkWinds and PolyPaint.

Dr. Handley walked through a demonstration of the DataHub input, process, and output modes.
In the near term, the project will be working on a journal and transaction manager, a more 
general data model for user-defined data and conversion, and an expert system for dataset type 
checking. Long term work will include a context-based and content-based data search capability, 
quality control and "corporate" memory of the conversion of input data points to output data 
points, and a more general search path and browsing mechanism. In response to a question, Dr.
Handley indicated that the project is now looking at how to handle non-gridded data. The
interface could be broadened to include connection to a visualization tool, but it would need a 
data model big enough to handle complicated, non-gridded data sets. Also, the project is trying 
to make DataHub independent of any tool or database. 


Data Visualization and Sensor Fusion 
Dr. Vance McCollough 
Hughes Aircraft Company 

This project, a collaboration between the University of Chicago and Hughes, is aimed at the 
application of medical imaging and visualization techniques to remote sensing. The DMSP data 
set was selected for study. Image processing and visualization algorithms were developed and 
tested on CAT and PET medical data. This included image matching using landmarks, 
contouring and thresholding, image linking, 3-D image processing/visualization and superposition 
of 2-D images, and image-indexed color table generation. 

Dr. McCollough described the characteristics of the DMSP data set, and how calibration,
navigation, projection, and gridding were accomplished. Medical imaging software was not as
well suited to projection and gridding problems as had been anticipated. Dr. McCollough showed 
examples of thresholding sensitivity, image-based color table indexing, and three-dimensional 
visualization of IR images. The work done at Hughes showed that the principal
components transformation of microwave channels can extract useful data, and that the principal
components can be used to drive hue, saturation, and intensity to produce useful visualizations
of multi-channel data.
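
As an illustration of the idea of driving color from principal components, the sketch below computes the three leading principal components of multi-channel pixel vectors and maps them to a color. It is not the Hughes implementation: the assignment of the first component to intensity, the second to hue, and the third to saturation is an assumption, the HSV model from the Python standard library stands in for hue/saturation/intensity, the eigenvectors are found by simple power iteration, and the pixel values are invented.

```python
import colorsys
import math

def center(pixels):
    """Subtract the per-channel mean from every pixel vector."""
    n, c = len(pixels), len(pixels[0])
    mean = [sum(p[j] for p in pixels) / n for j in range(c)]
    return [[p[j] - mean[j] for j in range(c)] for p in pixels]

def covariance(centered):
    """Channel-by-channel sample covariance matrix."""
    n, c = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(c)]
            for a in range(c)]

def matvec(m, v):
    return [sum(row[b] * v[b] for b in range(len(v))) for row in m]

def top_components(cov, k, iters=200):
    """Leading k eigenvectors of a symmetric matrix (power iteration with deflation)."""
    c = len(cov)
    m = [row[:] for row in cov]
    comps = []
    for _component in range(k):
        v = [1.0 / (j + 1) for j in range(c)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iters):
            w = matvec(m, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(m, v)))
        comps.append(v)
        # Deflate: remove the component just found before looking for the next one.
        m = [[m[a][b] - lam * v[a] * v[b] for b in range(c)] for a in range(c)]
    return comps

def rescale(values):
    """Stretch a list of scores linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.5 if span == 0 else (v - lo) / span for v in values]

def pca_to_rgb(pixels):
    """Assumed mapping: PC1 -> intensity (value), PC2 -> hue, PC3 -> saturation."""
    centered = center(pixels)
    pcs = top_components(covariance(centered), 3)
    scores = [[sum(r[j] * pc[j] for j in range(len(pc))) for r in centered] for pc in pcs]
    value, hue, sat = (rescale(s) for s in scores)
    return [colorsys.hsv_to_rgb(h, s, v) for h, s, v in zip(hue, sat, value)]

# Invented 4-channel values for six pixels (not DMSP data).
pixels = [
    [210.0, 180.0, 250.0, 190.0],
    [215.0, 170.0, 240.0, 200.0],
    [190.0, 160.0, 255.0, 185.0],
    [230.0, 200.0, 235.0, 210.0],
    [205.0, 175.0, 245.0, 195.0],
    [220.0, 150.0, 260.0, 180.0],
]
rgb = pca_to_rgb(pixels)
```

Each entry of `rgb` is a red, green, blue triple in the range 0 to 1 that could be written into an image at the corresponding pixel position.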

For the remainder of 1993 and 1994, the project plans to: apply visualization algorithms to
remote sensing data; acquire additional data for evaluation; evaluate visualization results;
investigate applications of principal component techniques; and do additional work on 3-D 
visualization of IR images. 


In the general discussion on Session II, the primary topic was meta data and data description. 
One observation made was that some tools are very useful, but are based on uniformly gridded 
data. Some space science is not image-based, and the issue is whether or not these tools can be
adapted for those purposes. There was general agreement that data description needs to be well
documented, precise, and concise. If data can be described in a complete and concise manner,
the need for data standards is not as critical. What is needed is a standard framework in which
to provide data information, a "format of formats." In addition, data structure needs to be
considered separately from data format. There was discussion on the advantages and
disadvantages of general use of tools vs. development of tools for specific purposes. Developing
the tools, using them intelligently, and understanding the data are all separate issues. 


Thursday, August 5, 1993 


SESSION III - DATA COMPRESSION/ARCHIVING/ACCESS/ANALYSIS
Chair: Dr. James Storer 

Data Compression 
Dr. James Storer 
Brandeis University 

Data compression research at Brandeis includes lossless compression, image compression, video 
compression, and error resilient compression. This presentation focussed on on-line image 
compression with adaptive vector quantization (VQ) with variable sized vectors. VQ is a lossy 
technique which has not been used extensively in the past due to its static characteristics. 
However, new work on a version of VQ overcomes this disadvantage. The on-line adaptive VQ 
is characterized by an "evolving" dictionary.

Dr. Storer described how the generic encoding algorithm works. Wave has proved to be the most 
efficient growing strategy, and all of the results shown in the presentation used the wave strategy. 
Dr. Storer also described how the growing points were selected, and how the best rectangle is 
chosen from the dictionary. The dictionary was modified by growing larger rectangles from 
smaller ones used in the past. The test sequence included a variety of images.
Experimental runs were done for each file, and compression levels were compared with JPEG.
JPEG is tuned for magazine photos, and JPEG compression was superior on those images. The
adaptive VQ compression was tuned for more unusual images.

A video compression approach is being developed using a superblock displacement estimation
technique. The superblock technique gives a factor of three to four improvement over fixed size
blocks. For science, a lot of images can be treated as video because the sequences look like
video and displacement estimation techniques are very effective. The goal is realtime high
fidelity video and image sequence compression of over 1000 to one. The project expects to
report more on this area next year. 
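
For readers unfamiliar with displacement estimation, the sketch below shows the conventional fixed-size block matching that the superblock technique is compared against. It is not the superblock method, whose details are not given in this summary; the exhaustive search, the sum-of-absolute-differences cost, the function names, and the synthetic frames are all assumptions made for illustration.

```python
def sad(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between a block in curr and a shifted block in prev."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(curr[top + r][left + c] - prev[top + r + dy][left + c + dx])
    return total

def best_displacement(prev, curr, top, left, size, search):
    """Exhaustively search +/- search pixels for the best-matching block in prev."""
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = sad(prev, curr, top, left, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would fall outside the previous frame.
            if top + dy < 0 or top + dy + size > height:
                continue
            if left + dx < 0 or left + dx + size > width:
                continue
            cost = sad(prev, curr, top, left, dy, dx, size)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost

# Synthetic 8x8 previous frame with a non-repeating pattern.
prev_frame = [[3 * r * r + 5 * c * c + r * c for c in range(8)] for r in range(8)]
# Current frame: the same scene moved one row down and two columns right.
curr_frame = [[prev_frame[r - 1][c - 2] if r >= 1 and c >= 2 else 0 for c in range(8)]
              for r in range(8)]

displacement, cost = best_displacement(prev_frame, curr_frame, top=3, left=3, size=3, search=2)
```

An encoder built this way transmits the displacement for each block plus any residual, instead of the block's pixels, which is where the compression comes from when consecutive images resemble each other.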


SAMS: A Spatial Analysis and Modeling System for Environmental Monitoring
Dr. Fran Stetina

Goddard Space Flight Center 

The objective of this project was to develop an end-to-end system to support environmental 
monitoring and management. It was a low-end pilot study for EOS direct readout. SAMS has 
incorporated realtime data reception, multi-discipline analysis and modeling, and expert systems. 
Dr. Stetina emphasized that this is not "cutting-edge" technology, but an attempt to adapt and 
integrate some new tools into existing technology that is reliable and maintainable. Most of the 
SAMS activities have been involved in integrating elements that already exist. The front end of 
the system can take any satellite data, realtime, and is automatic and transparent to the user. 
SAMS is multi-discipline: it can handle atmospheric, land, and ocean processes. It has modules
that will analyze data to support marine resources, water resource management, disaster 
management, and agriculture, as well as environmental monitoring. SAMS has a built-in DMS, 
performs data fusion, and is involved in distributing products through Internet, Peacesat, and 
Earth Alert, as well as hard copy and storage. 

SAMS, or a subsystem of it, has been utilized in ten projects or field operations centers around 
the world. SAMS started in 1988 with a direct readout, and will move into EOS direct readout
in 1995. Dr. Stetina described the accomplishments at the agricultural and environmental sites 
around the world that utilize SAMS. The software is available for users that have the hardware. 
In response to a question, Dr. Stetina indicated that the project wants to do more work with data
compression, and this has been looked at for some of the data. 


Land-Surface Testbed for EOS 
Mr. Tim Kelly 
University of Colorado 

Mr. Kelly described the historical background of Sanddunes, a land-surface image system for 
Earth resources. Current work in process includes the development of a stand-alone data archive 
system, the addition of DOMSAT data, and the addition of GrADS overlays. The total number 
of registered users (based upon login records) has risen to over 2500 as of June 1993, and the 
project is shipping about 250 images per month.

The new element of the system, called Navigate, enables Sanddunes to send a navigate command
to an IBM 6000 cluster (which navigates the images) and receive images into the user file. The
system is designed to be totally self sufficient. Plans are to expand the present system to include
global AVHRR data via the DOMSAT antenna, to assess the capability of the new independent
file system to handle the greater volumes of DOMSAT and other satellite data, and to explore
the capacity of the new network capabilities to handle the transfer of all stored and newly
collected data. This testbed will transition to the Version 0 EOSDIS before December 1994, and
will be located at either JPL or GSFC, or both. Until then, the Sanddunes will reside at LASP.
Although the current system is free, a potential user must be on Internet.


Development of a Tool-Set for Simultaneous, Multi-Site Observations of Astronomical Objects

Dr. S. Chakrabarti 
Boston University 

The primary purpose of this project was for education: to make the system available to all
[...] object requested is observable. If no suitable telescope is available or the observation is not
possible, an error message is relayed back to the user. If a telescope is found and observation
is possible, a telescope interface is established and the user starts observation.

Dr. Chakrabarti described several elements of the system which are used to link the user to one
telescope at a time, allow simultaneous requests to more than one telescope from a single user,
and allow monitoring of an observation in progress. There are three types of users of this
system: resident observers, remote observers, and remote watchers (who have no interactive
capability). The project is continuing to work networking issues (robustness, fault tolerance,
security, and data compression) and software issues (portability, support, and public domain).
Currently, all of the hardware and networking tools are in place, and some user interface has
been written. Future activities include: improvements to the user interface; realtime [...];
addition of imaging, spectroscopy, and interferometry capabilities; the addition of other platforms;
and the addition of analysis and artificial intelligence tools.


The general discussion session focussed on the topic of archiving data. The Workshop
participants discussed the pros and cons of letting market needs drive the
data archive. For the Global Change Research Program, the government needs to have a
[...] of historical environmental data, and a case was made that the government should be involved
in the archive of this type of data. However, it was recognized that currently there is not an
infinite resource for keeping data, and that sensors have the capability of generating more data
than can possibly be stored. One idea presented was to archive the raw data, but not all of the 
data products. In the future, with new state-of-the-art media, storage may not be as much of an 
issue as access. 


Geographic Information System (GIS) for Fusion and Analysis of High Resolution Remote
Sensing and Ground Truth Data 
Dr. Anthony Freeman 
Jet Propulsion Laboratory 

The overall aim of the project was to expand the user community for radar images by simplifying 
data display, analysis, and interpretation. The project started out by adapting a VICAR/IBIS GIS 
including models and different data sets, especially radar images. It achieved a working version 
of the system with a peer user interface. Other achievements included the development of 
MacSigma 0, work on supervised classification of radar images, work on validation models, the 
development of a three-component scattering model, and the development of a vegetation map 
classification (MAPVEG) expert system. 


Dr. Freeman described the three-component scattering model and several scattering mechanisms.
The project developed a technique of estimating the contribution of each of the three mechanisms 
and tried to develop an overall classifier. They developed some classification rules, as part of 
the expert system, and came up with a classifier for the three images trained on (Netherlands 
farmland, a rain forest, and Black Forest/farmland). The classifier was then run with other 
images, and gave a very good representation. Unsupervised classification can be applied to any
calibrated three-frequency AIRSAR data, but it is not designed for very low incidence angles. 


Dr. Freeman discussed the software that has been developed in this program. MacSigma 0, 
released through COSMIC, is for display and analysis of radar images and has export capabilities 
designed into the software. MAPVEG, which contains the classifier, takes AIRSAR data and 
produces a vegetation map. RAVEN, currently under development, performs display and analysis 
of radar images on UNIX. Image registration software is planned. MAC software and 
documentation is to be incorporated into the SIR-C education CD. 


Envision: A [...] for Management and Display of Large Data Sets

Dr. Kenneth Bowman 
Texas A&M University 

Envision integrates data management, manipulation, analysis, and display functions into a single 
interactive environment. It uses standard portable data storage and management tools to provide 
access to multi-GB data sets and provides a simple, intuitive, collaborative, and portable user 
interface. Envision will enhance the capabilities of existing interactive visualization software
developed at the NCSA.

The Envision Data Manager consists of a data server and maintains the project file and data 
storage. The user interface provides a table-like display of meta data. The initial release uses
NCSA XImage, NCSA Collage, and IDL. The project is planning to complete some functions
in the Data Manager, but is now focussing on making it easier for people to connect their own
tools and software.

Dr. Bowman gave a brief demonstration of Envision, which is available through anonymous ftp.


The meeting adjourned to the demonstration session. 


Friday, August 6, 1993 


SESSION IV - RESEARCH AND TECHNOLOGY 
Chair: Dr. Amy Walton 


Research and Technology Activities at JPL

Dr. Amy Walton 

Jet Propulsion Laboratory 


Dr. Walton discussed the current research and technology activities at JPL. The scope of the
activities includes: the development and exercise of tools that make use of visual perception to
integrate and display multiple diverse data sets, explore and validate data sets, track and measure
features and dynamics, provide perspective simulations, and generate scientist-controlled
animations; the automation and acceleration of analysis of large complex data sets; the
development of interactive capabilities; the extension of capabilities to a heterogeneous
computing environment; the incorporation of techniques that make the environment self-training
and extensible; and the migration of developed tools into the science environment.

JPL is currently working on improvements in data access, such as data storage options, ancillary
data, and standards/data interoperability. JPL also is working on integration and visualization 
testbeds and techniques that specifically support data analysis. Some of these projects are: 
imaging methods for multi-dimensional data visualization, graphical methods for science data 
analysis, scientific tools for JPL image archives, and an integrated science analysis testbed. Dr. 
Walton discussed each of these activities. 

Future directions are towards "desktop science", which will be a move from timesharing/LAN 
to WAN and will involve the use of smaller local machines to access varied computing resources 
with the ability to use varied data formats, multi-vendor hardware, and software. The ability to 
rapidly evaluate large data volumes, complex data sets, and combinations of data sets has become 
a major need in the scientific research process. Dr. Walton emphasized the importance of
"people transfer", or working together, as a key aspect of technology transfer.


Navigation Ancillary Information Facility: An Overview of SPICE

Dr. Amy Walton 

JPL 

There are two kinds of space science data: science instrument data and ancillary engineering data.
SPICE (Spacecraft Planet Instrument C-matrix Events) addresses ancillary engineering data, 
which can be from the spacecraft, mission control center, and scientists. Ancillary data are data 
that tell when an instrument was taking data, where the spacecraft was located, how the 
spacecraft and its instruments were oriented, and what was the size, shape, and orientation of the 
targets being observed. It can also tell how the instrument was acquiring the data and what else 
was happening on the spacecraft or in the ground data system. 

The principal SPICE system components are data files and software, plus standards, 
documentation, user support, and system maintenance. It was originally intended for space 
science data analysis, but is now also being used for science observation planning and mission 
evaluation from a science perspective. SPICE is currently being used on Voyager, Magellan, and 
Galileo. It is being planned for Hubble Space Telescope and other flight projects and terrestrial
programs. Possible future applications include MESUR, the Discovery Program, AXAF, EOS, 
and INTERBOL. In addition, SPICE can be used as a component of a space science education 
program at the university level.


Using the NAIF SPICE Kernel Concepts and the NAIF Toolkit Software for Geometry Parameter
Generation and Observation Visualization

Dr. Karen Simmons 

LASP, University of Colorado 


Dr. Simmons provided some historical background on why a program like SPICE was needed 
and how SPICE is being used. Better navigation data was needed because the Supplementary
Experiment Data Records (SEDRs) were static products and had conceptual flaws with respect
to spacecraft navigation, pointing, and science instrument design improvements. A working
group came up with a better way of doing this: SPICE Kernels, independent pieces of
knowledge that would contain 99.9% of the information that is wanted on what the data is and
how the data is being collected. Dr. Simmons discussed the advantages of the SPICE Kernels.
Kernels created the need for standardized parameter definition and generation. The SPICE
Toolkit promotes uniform understanding and use of geometry parameters.


The Geometry and Graphics Software (GGS) provides the expertise to use both the Kernels and 
the Toolkit. GGS is a tool which allows the scientist to understand the complex geometric 
environment in which a data set was obtained. GGS provides geometry parameters via active 
display, hardcopy, or footprint files, and provides science observation visualization via
after-the-fact animation of the commanded sequence, opportunity investigation, and planning and design.
GGS maintains and documents expertise both actively (mostly through Kernels) and passively.


Two types of Kernels represent all the knowledge needed about how data was collected. The 
SPICE Kernels are either I-Kernel (instrument related) or E-Kernel (event, experimenter, or
expert system related). SPICE is an exceptionally viable system. However, the Kernel
generation history documentation needs improvement (guidelines are needed), and the E-Kernels, 
although they represent a vital link, are still in development. 

The project has developed a number of versions of GGS, and has learned enough about how it 
works to make it a multi-mission tool. However, at present there is no way of handling 
differences among platforms. IDL was chosen for the scientific interaction with data and 
geometry because copies need to be propagated across different versions. However, as new IDL 
versions are released, the software must be kept updated. Future work can include: more
design-side tools; C-Kernel Smithing with non-imaging data; tour analysis tools; and investigation of
other uses, such as EOS, a PDS archive tool, and marriage with a sequence planning tool like
OASIS.


A captive guest account has been established to provide a demo of the GGS software package. 
Details are included in the presentation material in Appendix E.


The EOS/Pathfinder Interuse Experiment
Dr. Mike Botts 

University of Alabama in Huntsville 

Interuse signifies the ease with which data sets from various sensors and disciplines can be 
brought together, coregistered in space and time, correlated, analyzed, and visualized together 
within scientific tools. The present EOS direction of relying heavily on previously gridded and 
projected Level 3 data is inadequate for several reasons. Dr. Botts described the experiment plan. 
The new approach, Interuse, focusses on making Level 2 data more useful. However, several
issues must be resolved before an easily used Level 2 data set can be produced and distributed: 
generic navigation for EOS data; generic gridding and projection routines; incorporation of 
navigation information into HDF; data subsetting and compression; and the availability and 
compatibility of analysis and visualization tools. A number of key pieces could help with this 
experiment: SPICE, SPICEb and OoSPICE, PLATO, and extensible visualization tools such as 
LinkWinds, IDL/PV-Wave, AVS, Explorer, and Khoros. The project is looking at a modification 
of some of these tools that will allow navigation within the tool itself. 

Two Interuse Teams have been established. The Interuse Tiger Team will be developing some 
of the elements and focussing the activity. The Interuse Core Working Group, which includes 
scientists and data producers, will provide feedback, checkout, test, etc. Dr. Botts discussed the 
schedule for the experiment. During Phase I, the project will work through the issues. Phase 
II will consist of generation of data sets and distribution. Phase III will include assessment and 
refinement. 


Overview of Ames Research Center Advanced Network Applications 

Dr. John Yin 

Ames Research Center 

Dr. Yin described the functions of several elements of the ARC organization that are involved 
in advanced network applications. The Advanced Network Applications (ANA) Section at ARC 
provides network based solutions for NASA and other federal agencies through 
the development, implementation, and operations coordination and support of extensible high 
technology network services and applications. The Network Applications Information Center 
(NAIC), part of the ANA section, provides NASA-wide operations support of advanced network 
services for the NASA science and research community. The Network Services Development 
Group develops, implements, and transitions to the NAIC state-of-the-art standards-based network 
applications and services. 


Dr. Yin discussed some of the ANA development activities in detail. The X.400/SMTP 
electronic mail gateways provide an agency wide electronic messaging infrastructure which
allows for completely transparent document exchanges between all host and LAN based mailers.
The distributed X.500 directory service allows for the management and distribution of an 
electronic locator service common to all NASA sites. The electronic signatures and certificate 
management is a joint activity with the NASA Headquarters Office of the Comptroller, NIST,
and USPS to develop and demonstrate the generation, certification, management, and use of
electronic signatures for authenticating electronic documents. Privacy enhanced mail provides
a standards based end-to-end encryption of electronic messages. Packet video development and
deployment is a project sponsored by the NASA Headquarters Office of Aeronautics HPC
program for providing interoperable color packet video to UNIX workstations, PCs, and Macs
by the end of FY 1994, with initial demonstration on Sun workstations by the end of FY [...].
Wireless network development will demonstrate high speed wireless connectivity to portable 
computers with special emphasis on support for advanced applications like packet video. 

The participants discussed network accessibility, particularly Internet, which is still not widely 
used in the administrative and business sectors. Other issues discussed were funding for
connection to Internet for the science community, and a "dial-up" capability for NASA Science
Internet. Mr. Mucklow emphasized that NASA encourages the use of networks for science
collaboration, as reflected in the latest NASA Research Announcement (NRA). The NRA was
issued electronically, and the evaluation process has been set up electronically. When NASA has
electronic signature capability, proposals may be submitted electronically as well. Mr. Mucklow
noted that all electronic proposal documentation should be in PostScript or Acrobat rather than
ASCII format.


Summary and Action Items 

Mr. Mucklow invited Dr. Jim Dodge from the Mission To Planet Earth (MTPE) Office to share 
some of his thoughts on the AISRP and the Workshop. Dr. Dodge indicated that there were a 
lot of good elements demonstrated at the Workshop that he will be taking back to share with the 
Global Change Data Analysis Program. He particularly praised the tools for data 
handling/merging and tools for visualization and data fusion. He suggested that an article be 
written summarizing the tools that have been developed or are being developed. Accessing and 
analyzing extremely large amounts of data will be needed in EOS, and this aspect did not appear 
to be addressed fully in the Workshop presentations and demonstrations. Other challenges which 
need to be addressed are a comprehensive approach to error tracking and accuracy (e.g., an error 
tracking system on a pixel basis), and data compression, which will continue to be an issue with 
researchers. Dr. Simmons suggested that the process that the Galileo scientists had to go through 
in making decisions regarding compression (due to the antenna problem) might be of value to
the EOS scientists.

One of the advantages of the AISRP is that it is multi-disciplinary, and scientists have an
opportunity to learn from each other and exchange tools and software. Tools will often reveal 
problems in data and models, and programs need to take care that the data sets and models that 
are generated have high quality control. The desktop computer to supercomputer via networks 
is the new paradigm, and requirements are increasing while resources are decreasing. 

The group discussed the challenges in getting new tools into the hands of more scientists. There 
was consensus that this needs to be pushed more through the discipline technical societies, and 
this action was given to the scientific participants. A key element is to select partner users that 
are well respected and produce first-class science. If the groups using the tools are well 
respected in the science community and produce notable science results, then others in that 
community will be highly motivated to use the tools. 

One of the issues is support for users when a tool becomes successful. Mr. Bredekamp indicated 
that in the current environment of shrinking resources, developers need to find clever ways to 
minimize the need for support, and design tools that are very easy to use. NASA and the science 
community cannot continue to do software maintenance as has been traditionally done. One 
positive aspect that will help with this challenge is the continued collaboration and union of the 
computer science community with Earth and space scientists, which has been emphasized in the 
AISRP. 

Other specific needs include more work in the algorithm area, and tools that use AI techniques 
to search data in extremely large data sets. Before adjourning the Workshop, Mr. Mucklow noted 
that the ISB has the action item to send out the Integrated Technology Plan to the Workshop 
attendees, and the AISRP investigators have the action to get started earlier this year with the 
AGU on a special session.

Meeting Chair: Mr. Glenn H. Mucklow/NASA Office of Space Science
Mr. Joseph Bredekamp/NASA Office of Space Science

Software Support Laboratory and data Sampler CD-ROM 
Dr. Randy Davis, University of Colorado 


SESSION I - ARTIFICIAL INTELLIGENCE TECHNIQUES 

Chair: Richard Keller 

9:30 Announcements 

9:45 Multi-Layer Holographic Bifurcative Neural Network Systems for Real-Time
Adaptive EOS Data Analysis
Dr. Hua-Kuang Liu, JPL 

10:15 Break 

10:30 System of Experts for Intelligent Data Management (SEIDAM) 

Dr. David Goodenough, Energy Mines & Resources 

11:00 Construction of an Advanced Software Tool for Planetary Atmospheric
Modeling 

Dr. Richard Keller, ARC 

11:30 Multivariate Statistical Analysis SW Technologies for Astrophysics

Research Involving Large Data Bases 
Dr. George Djorgovski, California Institute of Technology 

12:00 Lunch 

1:30 SESSION I DISCUSSIONS 

2:30 DEMONSTRATIONS
ADJOURN to Reception
Reception at NCAR
1:30
3:15

3:45 

4:15 


Coffee 

SESSION II - SCIENTIFIC VISUALIZATION

Chair: Theo Pavlidis

The Grid Analysis and Display System (GRADS): A Practical Tool for

Earth Science Visualization 

Dr. James Kinter, University of Maryland 

A Distributed Analysis and Visualization System for Model and 

Observational Data 

Dr. Robert Wilhelmson, NCSA 

An Interactive Interface for NCAR Graphics 
Dr. Robert Lackman, NCAR

Topography from Shading and Stereo
Dr. Berthold Horn, MIT 

Planetary Data Analysis and Display System: A Version of PC-McIDAS
Dr. Sanjay Limaye, University of Wisconsin 

SESSION II DISCUSSIONS A 

LUNCH 

Experimenter's Laboratory for Visualized Interactive Science 
Dr. Elaine Hansen, University of Colorado 

SAVS: A Space Analysis and Visualization System 
Dr. Edward Szuszczewicz, SAIC 

LinkWinds: A Distributed System for Visualizing and Analyzing 
Multivariate and Multidisciplinary Data 
Dr. Allan Jacobson, JPL

DataHub: Knowledge-based Assistance for Science Visualization and
Analysis Using Large Distributed Databases 
Mr. Thomas Handley, JPL 

Advanced Data Visualization and Sensor Fusion 
Mr. Vance McCullough, Hughes Aircraft Co. 

SESSION II DISCUSSIONS B 


5:00 ADJOURN

Dr. James Storer, Brandeis University
Dr. Nicholas Roussopoulos, University of Maryland

9:45 A Spatial Analysis and Modeling System for Environment Management
Dr. Supriya Chakrabarti, Boston University
Resolution Remote Sensing and Ground Truth Data
Dr. Amy Walton, JPL
Dr. Chuck Acton, JPL
for Geometry Parameter Generation
Dr. Karen Simmons, University of Colorado 

10:15 Break 

10:30 Advanced Network Applications 

Mr. John Yin, ARC 

11:00 Splinter Group Reports

11:30 Summary and Action Items

12:00 END OF WORKSHOP III

An Interactive Environment for the Analysis of
Large Earth Observation and Model Data Sets



Principal Investigator: Assistant Professor Kenneth P. Bowman 

University of Illinois 

Co-Investigators: Professor John E. Walsh 

University of Illinois 

Professor Robert B. Wilhelmson 
University of Illinois 

Summary: 

We propose to develop an interactive environment for the analysis of large Earth science 
observation and model data sets. We will use a standard scientific data storage format and a 
large capacity (>20 GB) optical disk system for data management; develop libraries for 
coordinate transformation and regridding of data sets; modify the NCSA X Image and X 
DataSlice software for typical Earth observation data sets by including map transformations 
and missing data handling; develop analysis tools for common mathematical and statistical 
operations; integrate the components described above into a system for the analysis and 
comparison of observations and model results; and distribute software and documentation to 
the scientific community. 
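
As an illustration of the regridding and missing data handling mentioned above, the following is a minimal sketch only, not the proposed library. It coarsens a two-dimensional grid by block averaging and skips cells flagged as missing. The flag value -9999.0 and the function name regrid_block_mean are assumptions made for this example; the summary does not specify either.

```python
# Hypothetical missing-data flag; real data sets define their own fill value.
MISSING = -9999.0


def regrid_block_mean(grid, factor, missing=MISSING):
    """Coarsen a 2-D grid by averaging factor x factor blocks.

    Cells equal to `missing` are left out of each average. A block that
    contains only missing cells yields `missing` in the output grid.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            # Collect the valid cells of this block (edge blocks may be smaller).
            block = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, rows))
                for c in range(c0, min(c0 + factor, cols))
                if grid[r][c] != missing
            ]
            out_row.append(sum(block) / len(block) if block else missing)
        out.append(out_row)
    return out


fine = [
    [1.0, 3.0, MISSING, MISSING],
    [5.0, MISSING, MISSING, MISSING],
    [2.0, 2.0, 4.0, 8.0],
    [2.0, 2.0, 6.0, 6.0],
]
coarse = regrid_block_mean(fine, 2)
```

Block averaging is only one of several possible regridding schemes; it is used here because it makes the treatment of missing cells easy to see.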
Interactive Interface for National Center
for Atmospheric Research (NCAR) Graphics 


Principal Investigator: Dr. William Buzbee 

National Center for Atmospheric Research 

Co-Investigators: Robert L. Lackman 

National Center for Atmospheric Research 


Summary: 

NCAR Graphics is a FORTRAN 77 library of over 30 high-level graphics modules which are 
heavily used by science and engineering researchers at over 1500 sites worldwide, including
many universities and government agencies. These Earth science oriented modules now have 
a FORTRAN callable subroutine interface which excludes their use by non-programming 
researchers. This proposal outlines the development of a fully interactive "point and click" 
menu-based interface using the prevailing toolkit standard for the X-Window System. 
Options for direct output to the display window and/or output to a Computer Graphics 
Metafile (CGM) will be provided. X, PEX, and PHIGS will be implemented as the 
underlying windowing and graphics standards. Associated meteorological and geometric data 
sets would exploit the network extended NASA Common Data Format, netCDF.

Development of a Tool-Set for Simultaneous,
Multi-Site Observations of Astronomical Objects 


Principal Investigator: Dr. Supriya Chakrabarti 

Boston University 

Co-Investigators: Dr. J. Garrett Jernigan

University of California, Berkeley 

Dr. Herman L. Marshall 
University of California, Berkeley 





Summary 

A network of ground and space based telescopes can provide continuous observation of 
astronomical objects. In a "Target of Opportunity" scenario triggered by the system, any 
telescope on the network may request supporting observations. We propose to develop a set 
of data collection and display tools to support these observations. We plan to demonstrate the 
usefulness of this toolset for simultaneous multi-site observations of astronomical targets. 
Possible candidates for the proposed demonstration include the Extreme Ultraviolet Explorer, 
International Ultraviolet Explorer, ALEXIS, and sounding rocket experiments. Ground based 
observations operated by the University of California, Berkeley; the Jet Propulsion 
Laboratory; and Fairborn Observatory, Mesa, Arizona will be used to demonstrate the 
proposed concept. Although the demonstration will involve astronomical investigations, these 
tools will be applicable to a large number of scientific disciplines. The software tools and
systems developed as a result of our work will be made available to the scientific community.

Multivariate Statistical Analysis Software Technologies for
Astrophysical Research Involving Large Data Bases 


Principal Investigator: Professor Stanislav Djorgovski 

California Institute of Technology 


Summary: 

The existing and forthcoming data bases from NASA missions contain an abundance of 
information whose complexity cannot be efficiently tapped with simple statistical techniques. 
Powerful multivariate statistical methods already exist which can be used to harness much of
the richness of these data. Automatic classification techniques have been developed to solve 
the problem of identifying known types of objects in multiparameter data sets, in addition to 
leading to the discovery of new physical phenomena and classes of objects. We propose an 
exploratory study and integration of promising techniques in the development of a general 
and modular classification/analysis system for very large data bases, which would enhance 
and optimize data management and the use of human research resources.

A Land-Surface Testbed for EOSDIS


Principal Investigator: Dr. William Emery 

University of Colorado 

Co-Investigators: Dr. Jeff Dozier 

University of California, Santa Barbara 

Paul Rotar 

National Center for Atmospheric Research 


Summary: 

We propose to develop an on-line data distribution and interactive display system for the 
collection, archival, distribution and analysis of operational weather satellite data for 
applications in land surface studies. A 1,000 km2 scene of the western U.S. (centered on the 
Colorado Rockies) will be extracted from Advanced Very High Resolution Radiometer 
(AVHRR) imagery collected from morning and afternoon passes of the NOAA polar-orbiters
at the direct readout stations operated by CU/CCAR. All five channels of these AVHRR data 
will be navigated and map registered at CU/CCAR and then be transferred to NCAR for 
storage in an on-line data system. Software will also be available at NCAR to process and 
navigate the raw AVHRR data as needed. Display workstation software, based on a
Macintosh II computer, will be developed to display and further process the AVHRR
data for studies of vegetation monitoring and snowpack assessment. Various options of 
presently used techniques for both vegetation and snowpack monitoring will be implemented 
in the workstation software to provide the individual investigator with the freedom to interact 
with the satellite image data. The display software will be freely distributed online to 
interested investigators and the AVHRR data will be made available on-line to anyone 
interested. In addition, potential users will be sought out and connected to the on-line data 
archive. This experiment with an active on-line archive and interactive analysis systems will 
provide experience with a small scale EOSDIS. 
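
The summary does not name the vegetation monitoring techniques to be implemented. One widely used index for AVHRR data is the normalized difference vegetation index (NDVI), computed from channel 1 (visible) and channel 2 (near infrared). The sketch below assumes calibrated reflectances as input and is offered only as an example of the kind of per-pixel processing the workstation software might perform; the function names and sample values are hypothetical.

```python
def ndvi(visible, near_ir):
    """Normalized difference vegetation index for one pixel.

    Returns None where the index is undefined (both reflectances zero).
    """
    total = near_ir + visible
    if total == 0:
        return None
    return (near_ir - visible) / total


def ndvi_image(ch1, ch2):
    """Apply ndvi() pixel by pixel to two co-registered channel images."""
    return [
        [ndvi(v, n) for v, n in zip(row1, row2)]
        for row1, row2 in zip(ch1, ch2)
    ]


# Tiny made-up 2 x 2 scene: channel 1 (visible) and channel 2 (near infrared).
ch1 = [[0.10, 0.50], [0.00, 0.20]]
ch2 = [[0.50, 0.50], [0.00, 0.60]]
index = ndvi_image(ch1, ch2)
```

Values near +1 indicate dense green vegetation, while values near zero or below are typical of bare soil, snow, cloud or water. The index is only meaningful when the two channels are navigated to the same map grid, as the summary describes.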
Geographic Information System for Fusion and Analysis of
High-Resolution Remote Sensing and Ground Truth Data 


Principal Investigator: Anthony Freeman

Jet Propulsion Laboratory

Co-Investigators: Jo Bea Way

Jet Propulsion Laboratory

Pascale Du Bois

Jet Propulsion Laboratory

Franz Leberl
VEXCEL Corporation


Summary: 

We seek to combine high-resolution remotely sensed data with models and ground truth 
measurements, in the context of a Geographical Information System, integrated with
specialized image processing software. We will use this integrated system to analyze the data
from two Case Studies, one at a boreal forest site, the other a tropical forest site. We will
assess the information content of the different components of the data, determine the optimum 
data combinations to study biogeophysical changes in the forest, assess the best way to 
visualize the results, and validate the models for the forest response to different radar 
wavelengths/polarizations. 

During the 1990’s, unprecedented amounts of high-resolution images from space of the
Earth’s surface will become available to the applications scientist from the LANDSAT/TM
series, European and Japanese ERS-1 satellites, RADARSAT and SIR-C missions. When the
Earth Observation Systems (EOS) program is operational, the amount of data available for a
particular site can only increase. The interdisciplinary scientist, seeking to use data from
various sensors to study his site of interest, may be faced with massive difficulties in
manipulating such large data sets, assessing their information content, determining the 
optimum combinations of data to study a particular parameter, visualizing his results and 
validating his model of the surface. The techniques to deal with these problems are also 
needed to support the analysis of data from NASA’s current program of Multi-sensor 
Airborne Campaigns, which will also generate large volumes of data. 


In the Case Studies outlined in this proposal, we will have somewhat unique data sets. For
the Bonanza Creek Experimental Forest (Case I) calibrated DC-8 SAR data and extensive
ground truth measurements are already at our disposal. The data set shows documented
evidence of temporal change. The Belize Forest Experiment (Case II) will produce calibrated
DC-8 SAR and AVIRIS data, together with extensive measurements on the tropical rain forest
itself. The extreme range of these sites, one an Arctic forest, the other a tropical rain forest,
has been deliberately chosen to find common problems which can lead to generalized
observations and unique problems with data which raise issues for the EOS System.

Construction of an Advanced Software Tool
for Planetary Atmospheric Modeling

Principal Investigator: Dr. Peter Friedland 

Ames Research Center 


Co-Investigators: Dr. Richard M. Keller 

Ames Research Center 

Dr. Christopher P. McKay 
Ames Research Center 

Michael H. Sims 
Ames Research Center 

Dr. David E. Thompson 
Ames Research Center 


Summary: 

Scientific model-building can be a time intensive and painstaking process, often involving the 
development of large complex computer programs. Despite the effort involved, scientific 
models cannot be distributed easily and shared with other scientists. In general, implemented 
scientific models are complicated, idiosyncratic, and difficult for anyone but the original 
scientist/programmer to understand. We propose to construct a scientific modeling software 
tool that serves as an aid to the scientist in developing, using and sharing models. The 
proposed tool will include an interactive intelligent graphical interface and a high-level 
domain-specific modeling language. As a testbed for this research, we propose to develop a 
software prototype in the domain of planetary atmospheric modeling.

System of Experts for Intelligent Data Management (SEIDAM)


Principal Investigator: Dr. David G. Goodenough 

Canada Centre for Remote Sensing (CCRS)

Co-Investigators: Joji Iisaka 

Canada Centre for Remote Sensing 

Ko Fung 

University of Ottawa 


Summary: 

It is proposed to conduct research and development on a system of expert systems for 
intelligent data management (SEIDAM). CCRS has much expertise in developing systems for 
integrating geographic information with space and aircraft remote sensing data and in 
managing large archives of remotely sensed data. SEIDAM will be composed of expert 
systems grouped in three levels. At the lowest level, the expert systems will manage and 
integrate data from diverse sources, taking account of symbolic representation differences and 
varying accuracies. Existing software can be controlled by these expert systems, without 
rewriting existing software into an Artificial Intelligence (AI) language. At the second level, 
SEIDAM will take the interpreted data (symbolic and numerical) and combine these with data 
models. At the top level, SEIDAM will respond to user goals for predictive outcomes given 
existing data. The SEIDAM Project will address the research areas of expert systems, data 
management, storage and retrieval, and user access and interfaces.

Knowledge-based Assistance for Science Visualization
and Analysis Using Large Distributed Databases 


Principal Investigator: Thomas H. Handley, Jr. 

Jet Propulsion Laboratory 

Co-Investigators: Dr. Allan S. Jacobson 

Jet Propulsion Laboratory 

Dr. Richard J. Doyle 
Jet Propulsion Laboratory 

Dr. Donald J. Collins 
Jet Propulsion Laboratory 


Summary: 

Within this decade, the growth in complexity of exploratory data analysis and the sheer 
volume of space data require new and innovative approaches to support science investigators
in achieving their research objectives. To date, there have been numerous efforts addressing 
the individual issues involved in inter-disciplinary, multi-instrument investigations. However, 
while successful in small scale, these efforts have not proven to be open and scaleable. 

This proposal addresses four areas of significant need: scientific visualization and analysis; 
science data management; interactions in a distributed, heterogeneous environment; and
knowledge-based assistance for these functions. The fundamental innovation embedded
within this proposal is the integration of three automation technologies, namely, knowledge-based
expert systems, science visualization and science data management. This integration is
based on the concept called the DataHub. With the DataHub concept NASA will be able to
apply a more complete solution to all nodes of a distributed system. Both computation nodes
and interactive nodes will be able to effectively and efficiently use the data services (access,
retrieval, update, etc.) with a distributed, interdisciplinary information system in a uniform
and standard way. This will allow the science investigators to concentrate on their scientific 
endeavors rather than to involve themselves in the intricate technical details of the systems 
and tools required to accomplish their work. Thus, science investigators need not be 
programmers. The emphasis will be on the definition and prototyping of system elements 
with sufficient detail to enable data analysis and interpretation leading to publishable scientific 
results. In addition, the proposed work includes all the required end-to-end components and 
interfaces to demonstrate the completed concept.

Experimenter’s Laboratory for Visualized Interactive Science



Principal Investigator: Elaine R. Hansen 

University of Colorado at Boulder 

Co-Investigators: Marjorie K. Klemp 

University of Colorado at Boulder 

Sally W. Lasater 

University of Colorado at Boulder 

Marti R. Szczur 
Goddard Space Flight Center 

Joseph B. Klemp 

National Center for Atmospheric Research 

Summary: 

The science activities of the 1990’s will require the analysis of complex phenomena and large 
diverse sets of data. In order to meet these needs, we must take advantage of advanced user 
interaction techniques: modern user interface tools; visualization capabilities; affordable, high
performance graphics workstations; and interoperable data standards and translators. To meet
these needs, we propose to adopt and upgrade several existing tools and systems to create an
experimenter’s laboratory for visualized interactive science. Intuitive human-computer
interaction techniques have already been developed and demonstrated at the University of 
Colorado. A Transportable Applications Executive (TAE+), developed at GSFC, is a 
powerful user interface tool for general purpose applications. A 3D visualization package 
developed by NCAR provides both color-shaded surface displays and volumetric rendering in 
either index or true color. The Network Common Data Form (NetCDF) data access library 
developed by Unidata supports creation, access and sharing of scientific data in a form that is 
self-describing and network transparent. The combination and enhancement of these packages 
constitutes a powerful experimenter’s laboratory capable of meeting key science needs of the
1990’s. This proposal encompasses the work required to build and demonstrate this
capability.

N95- 34229


Topography from Shading and Stereo 


Principal Investigator: Professor Berthold P. Horn
Massachusetts Institute of Technology

Co-Investigators: Michael Caplinger
Arizona State University


Summary: 

Methods exploiting photometric information in images that have been developed in machine 
vision can be applied to planetary imagery. Present techniques, however, focus on one visual 
cue such as shading or binocular stereo, and produce results that are 

in an absolute sense or provide information only at few points on the surface. We plan to 
integrate shape from shading, binocular stereo and photometric stereo to yield a robust system 
for recovering detailed surface shape and surface reflectance information. Such a system will
be useful in producing quantitative information from the vast volume of imagery being 
derived, as well as in helping visualize the underlying surface. The work will be carried out
on a popular computing platform so that it will be easily accessible to other workers. 


N95- 34230 


A Distributed System for Visualizing and Analyzing 
Multivariate and Multidisciplinary Data 

Principal Investigator: Dr. Allan S. Jacobson 

Jet Propulsion Laboratory 

Co-Investigators: Dr. Mark Allen 

Dr. Michael Bailey 
Dr. Ronald Blom 
Leo Blume 
Dr. Lee Elson 

[all from Jet Propulsion Laboratory] 



Summary: 

The Linked Windows Interactive Data System (LinkWinds) is being developed with NASA 
support. The objective of this proposal is to adapt and apply that system in a complex 
network environment containing elements to be found by scientists working in multidisciplinary
teams on very large scale and distributed data sets. The proposed three year program will 

Specific visualization and analysis tools, to be exercised locally and remotely in the
LinkWinds environment, to demonstrate visual data analysis, interdisciplinary data analysis 
and cooperative and interactive televisualization and analysis of data by geographically 

separated science teams. These demonstrations will involve at least two science disciplines 
with the aim of producing publishable results.


N95- 34231


The Grid Analysis and Display System (GRADS): 
A Practical Tool for Earth Science Visualization 


Principal Investigator: Dr. James L. Kinter, III
University of Maryland


Summary: 

We propose to develop and enhance a workstation based grid analysis and display software 
system for Earth science dataset browsing, sampling and manipulation. The system will be 
coupled to a supercomputer in a distributed computing environment for near real-time 
interaction between scientists and computational results.

Planetary Data Analysis and Display System:

A Version of PC-McIDAS 



Principal Investigator: Dr. Sanjay S. Limaye 

University of Wisconsin-Madison 

Co-Investigators: L. A. Sromovsky 

University of Wisconsin-Madison 

R. S. Saunders 

Jet Propulsion Laboratory 

Michael Martin 

Jet Propulsion Laboratory 

Summary: 

We propose to develop a system for access and analysis of planetary data from past and 
future space missions based on an existing system, the PC-McIDAS workstation. This system 
is now in use in the atmospheric science community for access to meteorological satellite and 
conventional weather data. The proposed system would be usable by not only planetary 
atmospheric researchers but also by the planetary geologic community. By providing the 
critical tools of an efficient system architecture, newer applications and customized user 
interfaces can be added by the end user within such a system.


N95- 34233

Multi-Layer Holographic Bifurcative Neural Network
System for Real-Time Adaptive EOS Data Analysis 


Principal Investigator: Dr. Hua-Kuang Liu 

Jet Propulsion Laboratory 


Co-Investigators: Professor K. Huang
University of Southern California

J. Diep

Jet Propulsion Laboratory


Summary: 


Optical data processing techniques have the inherent advantage of high data throughput, low
weight and low power requirements. These features are particularly desirable for onboard
spacecraft in-situ real-time data analysis and data compression applications. The proposed
multi-layer optical holographic neural net pattern recognition technique will utilize the
nonlinear photorefractive devices for real-time adaptive learning to classify input data content
and recognize unexpected features. Information can be stored either in analog or digital form
in a nonlinear photorefractive device. The recording can be accomplished in time scales
ranging from milliseconds to microseconds. When a system consisting of these devices is
organized in a multi-layer structure, a feedforward neural net with bifurcating data
classification capability is formed. The interdisciplinary research will involve the
collaboration with top digital computer architecture experts at the University of Southern
California.

Development of an Expert Data Reduction Assistant


Principal Investigator: Dr. Glenn E. Miller 

Space Telescope Science Institute 


Co-Investigators: Dr. Mark D. Johnston 

Space Telescope Science Institute 

Dr. Robert J. Hanisch 

Space Telescope Science Institute 


Summary: 

We propose the development of an expert system tool for the management and reduction of 
complex data sets. The proposed work is an extension of a successful prototype system for 
the calibration of CCD images developed by Dr. Johnston in 1987. (ref.: Proceedings of the 
Goddard Conference on Space Applications of Artificial Intelligence) 

The reduction of complex multi-parameter data sets presents severe challenges to a scientist. 
Not only must a particular data analysis system (e.g., IRAF/SDAS/MIDAS) be mastered, but
large amounts of data can require many days of tedious work and supervision by the scientist
for even the most straightforward reductions. The proposed Expert Data Reduction Assistant 
will help the scientist overcome these obstacles by developing a reduction plan based on the 
data at hand and producing a script for the reduction of the data in a target common 
language.

VIEWCACHE: An Incremental Pointer-based Access
Method for Autonomous Interoperable Databases 


Principal Investigator: Associate Professor N. Roussopoulos 

University of Maryland 

Co-Investigators: Timos Sellis 

University of Maryland 


Summary: 


One of the biggest problems facing NASA today is to provide scientists efficient access to a large
number of distributed databases. Our pointer-based incremental database access method, 
VIEWCACHE, provides such an interface for accessing distributed datasets and directories. 
VIEWCACHE allows database browsing and search, performing inter-database
cross-referencing with no actual data movement between database sites. This organization and
processing is especially suitable for managing Astrophysics databases which are physically 
distributed all over the world. Once the search is complete, the set of collected pointers 
pointing to the desired data are cached. VIEWCACHE includes spatial access methods for 
accessing image datasets, which provide much easier query formulation by referring directly 
to the image and very efficient search for objects contained within a two-dimensional 
window. We will develop and optimize a VIEWCACHE External Gateway Access to 
database management systems to facilitate distributed database search. 
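
The pointer-caching idea described above can be illustrated with a short sketch. It is not the VIEWCACHE implementation: the class PointerViewCache, the two catalogs and their records are all hypothetical, and the incremental update of cached views is not modeled. The sketch shows only that a two-dimensional window search can return pointers (site, key) rather than records, and that the records are read from their home site only when a cached view is later dereferenced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    site: str  # name of the database that holds the record
    key: int   # record identifier within that database


# Two hypothetical autonomous catalogs; each record stays at its own site.
SITES = {
    "optical": {
        1: {"name": "NGC-A", "x": 10.0, "y": 20.0},
        2: {"name": "NGC-B", "x": 50.0, "y": 60.0},
    },
    "xray": {
        7: {"name": "SRC-7", "x": 11.0, "y": 21.0},
        8: {"name": "SRC-8", "x": 90.0, "y": 5.0},
    },
}


class PointerViewCache:
    """Caches search results as pointers instead of copied records."""

    def __init__(self, sites):
        self.sites = sites
        self.views = {}  # view name -> list of Pointer

    def materialize(self, name, xmin, xmax, ymin, ymax):
        """Search every site for objects inside a 2-D window; cache pointers only."""
        ptrs = [
            Pointer(site, key)
            for site, table in self.sites.items()
            for key, rec in table.items()
            if xmin <= rec["x"] <= xmax and ymin <= rec["y"] <= ymax
        ]
        self.views[name] = ptrs
        return ptrs

    def fetch(self, name):
        """Dereference a cached view: records are read from their home site now."""
        return [self.sites[p.site][p.key] for p in self.views[name]]


cache = PointerViewCache(SITES)
ptrs = cache.materialize("near_origin", 0.0, 20.0, 0.0, 30.0)
names = [rec["name"] for rec in cache.fetch("near_origin")]
```

A linear scan stands in here for the spatial access methods mentioned in the summary; a real system would use an index structure so that the window search does not touch every record.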
Advanced Data Visualization and Sensor Fusion:
Conversion of Techniques from Medical Imaging to Earth Science


Principal Investigator: Dr. Richard C. Savage

Hughes Aircraft Company 

Co-Investigators: Dr. Chin-Tu Chen 

University of Chicago 

Dr. Charles Pelizzari 
University of Chicago 

Dr. Veerabhadran Ramanathan 
University of Chicago 


Summary: 

Hughes Aircraft Company and the University of Chicago propose to transfer existing medical 
imaging registration algorithms to the area of multi-sensor data fusion. The University of 
Chicago’s algorithms have been successfully demonstrated to provide pixel by pixel 
comparison capability for medical sensors with different characteristics. The research will 
attempt to fuse GOES, AVHRR, and SSM/I sensor data which will benefit a wide range of 
researchers. 

The algorithms will utilize data visualization and algorithm development tools created by 
Hughes in its EOSDIS prototyping. This will maximize the work on the fusion algorithms 
since support software (e.g. input/output routines) will already exist. The research will 
produce a portable software library with documentation for use by other researchers. 



N95- 34237 





High Performance Compression of Science Data 


Principal Investigator: Dr. James A. Storer
Brandeis University

Co-Investigators: Dr. Martin Cohn
Brandeis University


Summary: 

In the future, NASA expects to gather over a tera-byte per day of data requiring space for 
levels of archival storage. Data compression will be a key component in systems that store 
this data (e.g., optical disk and tape) as well as in communications systems (both between 
space and Earth and between scientific locations on Earth). We propose to develop 
algorithms that can be a basis for software and hardware systems that compress a wide 
variety of scientific data with different criteria for fidelity/bandwidth tradeoffs. The 
algorithmic approaches we consider are specially targeted for parallel computation where data 
rates of over 1 billion bits per second are achievable with current technology. 
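
The fidelity/bandwidth tradeoff mentioned above can be made concrete with a small sketch. It does not represent the algorithms proposed here: the general-purpose DEFLATE coder from the Python standard library (zlib) stands in for the compressor, and uniform quantization with an assumed step size stands in for a fidelity criterion. The test signal and all names are made up for illustration.

```python
import math
import struct
import zlib


def compress_lossless(values):
    """Pack values as 64-bit floats and compress; decoding restores them exactly."""
    return zlib.compress(struct.pack(f"<{len(values)}d", *values))


def decompress_lossless(blob, n):
    return list(struct.unpack(f"<{n}d", zlib.decompress(blob)))


def compress_quantized(values, step):
    """Lossy path: round each value to a multiple of `step`, then compress.

    A larger step means lower fidelity but more repeated codes to compress.
    """
    codes = [round(v / step) for v in values]
    return zlib.compress(struct.pack(f"<{len(codes)}i", *codes))


def decompress_quantized(blob, n, step):
    codes = struct.unpack(f"<{n}i", zlib.decompress(blob))
    return [c * step for c in codes]


# Made-up smooth test signal standing in for a science data stream.
signal = [math.sin(i / 50.0) for i in range(2000)]
raw_bytes = 8 * len(signal)

exact = compress_lossless(signal)
coarse = compress_quantized(signal, 0.1)
restored = decompress_quantized(coarse, len(signal), 0.1)

# The reconstruction error of the lossy path is bounded by half the step size.
max_err = max(abs(a - b) for a, b in zip(signal, restored))
```

Comparing len(exact) and len(coarse) against raw_bytes for several step sizes shows how much storage or bandwidth is gained for a given loss of fidelity.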
SAVS: A Space Analysis and Visualization System

Principal Investigator: Dr. Edward P. Szuszczewicz

Science Applications International Corporation 

Co-Investigators: Dr. Alan Mankofsky 

Science Applications International Corporation 

Dr. Charles C. Goodrich 
University of Maryland 

Summary: 

We propose to develop, test, demonstrate, and deliver to NASA a powerful and versatile data 
acquisition, manipulation, analysis and visualization system which will enhance scientific 
capabilities in the display and interpretation of diverse and distributed data within an 
integrated user-friendly environment. Our approach exploits existing technologies and 
combines three major elements into an easy-to-use interactive package: 1) innovative 
visualization software, 2) advanced database techniques, and 3) a rich set of mathematical and 
image processing tools. Visualization capabilities will include one-, two-, and three- 
dimensional displays, along with animation, compression, warping and slicing functions. 
Analysis tools will include generic mathematical and statistical techniques along with the 
ability to use large scale models for interactive interpretation of large volume data sets. Our 
system will be implemented on Sun and DEC UNIX workstations and on the Stardent 
Graphics Supercomputer. Our final deliverable will include complete documentation and a 
NASA/NSF-CDAW/SUNDIAL campaign demonstration.

A Spatial Analysis and Modeling System (SAMS)
for Environment Management


Co-Investigators: Fran Stetina 

Goddard Space Flight Center 

Dr. John Hill 

Louisiana State University 
Dr. Paul Chan 

Science Systems and Applications, Inc. 
Robert Jaske 

Federal Emergency Management Agency 

Gilbert Rochon 
Dillard University 


Summary: 

This is a proposal to develop a uniform global environmental data gathering and distribution 
system to support the calibration and validation of remotely sensed data. SAMS is based on 
an enhanced version of FEMA’s Integrated Emergency Management Information Systems and 
the Department of Defense’s Air Land Battlefield Environment Software Systems. This 
system consists of state-of-the-art graphics and visualization techniques, simulation models, 
database management and expert systems for conducting environmental and disaster 
preparedness studies. This software package will be integrated into various Landsat and 
UNEP-GRID stations which are planned to become direct readout stations during the EOS 
timeframe. This system would be implemented as a pilot program to support the Tropical 
Rainfall Measuring Mission (TRMM). This will be a joint NASA-FEMA-University-Industry 
project. 

A Distributed Analysis and Visualization 
System for Model and Observational Data 


Principal Investigator: Professor Robert Wilhelmson 

University of Illinois 

Co-Investigators: Dr. Steven Koch 

Goddard Space Flight Center 


Summary: 

The objective of this proposal is to develop an integrated and distributed analysis and display 
software system which can be applied to all areas of the Earth System Science to study 
numerical model and earth observational data from storm to global scale. This system will be 
designed to be easy to use, portable, flexible and easily extensible and to adhere to current 
and emerging standards whenever possible. It will provide an environment for visualization 
of the massive amounts of data generated from satellites and other observational field 
measurements and from model simulations during or after their execution. Two- and three- 
dimensional animation will also be provided. This system will be based on a widely used 
software package from NASA called GEMPAK and prototype software for three-dimensional 
interactive displays built at NCSA. The underlying foundation of the system will be a set of 
software libraries which can be distributed across a UNIX based supercomputer and 
workstations. 

APPENDIX D 


Demonstrations 




Demonstration Schedule for AISRP Workshop 


Machine                          Host       Demonstration                          Time

SGI Indigo 2 Extreme             sgidemo2   PATHFINDER                             3 - 3:40
                                            LinkWinds                              3:40 - 4:20
                                            Datahub                                4:20 - 5

SGI Indigo 2 Extreme (Limaye)    sgidemo    McIdas-eXplorer                        3 - 5

DECstation                       nowhere    EOS testbed                            3 - 4
                                            Interactive NCAR Graphics              4 - 5

IBM RS6000                       rsdemo     Envision                               3 - 4
                                            SAVS (Space Data Analysis              4 - 5
                                            and Visualization System)

SGI Indigo XS24                  picasso2   PolyPaint+                             3 - 4
                                            GRADS (Grid Analysis and               4 - 5
                                            Display System)

Sun 3 (use as X display for      sirius     SIGMA (Scientists' Intelligent         3 - 4
Sparcstation)                               Graphical Modeling Assistant)
                                            Software Support Laboratory            4 - 5

VAXstation                       pup        GGGS (Galileo Geometry and             3 - 5
                                            Graphics Software)

Macintosh                                   MacSigma0                              4 - 5


PATHFINDER

A Distributed System for the Visualization and Analysis of
Observed and Modeled Meteorological Data
PI: Dr. Robert Wilhelmson, Univ. of Illinois

LinkWinds

LinkWinds: The Linked Windows Interactive Data System
PI: Dr. Allan S. Jacobson, JPL

Datahub

DataHub: Knowledge-Based Science Data Management
PI: Tom Handley, JPL

McIDAS-eXplorer

Planetary Data Analysis and Display System: A Version of PC-McIDAS
PI: Dr. Sanjay S. Limaye, University of Wisconsin-Madison

EOS Testbed

A Land-Surface Testbed for the EOS Data Information System (EOSDIS)
PI: Dr. William Emery, Colorado Center for Astrodynamics Research

Interactive NCAR Graphics

Interactive NCAR Graphics
PI: Bob Lackman, NCAR

Envision

Envision: An Analysis and Display System for Large
Geophysical Data Sets
PI: Dr. Kenneth Bowman, Texas A&M

SAVS

SAVS: A Space Data Analysis and Visualization System
PI: Dr. E. Szuszczewicz, Laboratory for Atmospheric and Space
Science/SAIC

PolyPaint+

Experimenter's Laboratory for Visualized Interactive Science
PI: Elaine Hansen, University of Colorado

GRADS

The Grid Analysis and Display System (GrADS)
PI: Dr. James L. Kinter, Center for Ocean-Land-Atmosphere
Interactions, University of Maryland

MacSigma0

Geographic Information System (GIS) for Fusion and Analysis
of High-Resolution Remote Sensing and Ground Truth Data
PI: Dr. A. Freeman, JPL

Software Support Laboratory

PI: Randy Davis, University of Colorado/LASP

GGGS

Galileo Geometry and Graphics Software
Karen Simmons, Kirk Benell, University of Colorado/LASP

SIGMA

Scientists' Intelligent Graphical Modeling Assistant
PI: Dr. Richard Keller, Ames Research Center

SOFTWARE SUPPORT LABORATORY 

What The SSL Would Like From You 



See Appendix Page E-48 for NCSA Mosaic 











HOLOGRAPHIC NEURAL NETWORKS 

+ If we create new knowledge, how do we ensure that this new 
knowledge is consistent with existing knowledge in our 
system? 

+ How do we attach ratings to the new knowledge reflecting a 
level of certainty in these new rules? 

















LOCATION OF TEST SITES 

NRCan staff working with BCMOF created a System of 
Hierarchical Experts for Resource Inventories (SHERI) 









• Model re-engineering and results replication

  - Lindal et al. 1983 (Voyager I data analysis)
  - Running and Coughlan 1988 (forestry data)

• Model-building framework (modeling as planning)

• Knowledge acquisition and maintenance

• Scenario configuration

• Control constructs

MULTIVARIATE STATISTICAL ANALYSIS 
SOFTWARE TECHNOLOGIES FOR 
ASTROPHYSICAL RESEARCH 
INVOLVING LARGE DATABASES 


Dr. S. Djorgovski 
California Institute of Technology 


August 3, 1993 






MULTIVARIATE STATISTICAL ANALYSIS SOFTWARE 

TECHNOLOGIES FOR ASTROPHYSICAL RESEARCH 

INVOLVING LARGE DATABASES 

PI - S. GEORGE DJORGOVSKI 

ASTRONOMY, MS 105-24, CALTECH, PASADENA, CA 91125 

EMAIL: george@deimos.caltech.edu 

Ph. (818) 395-4415 FAX (818) 568-1517 

Two stages of the project: 

1. Development of STATPROG, a user-friendly, science-driven 
package for multivariate statistical analysis of (relatively) 
small data sets (typically < 10,000 data vectors, < 30 dims.) 

2. Development of SKICAT (née FRITZ), an AI-based system 
for processing and analysis of about 3 TB of digital image 
information from the Second Palomar Sky Survey, and 
other, present and future large astronomical data sets. 

SKICAT is a collaborative project between JPL 
(U. Fayyad, R. Doyle, et al.) and Caltech (N. Weir, 
S.G. Djorgovski, et al.) 

* The handout may contain additional explanatory material, 
not shown during the oral presentation 

SKICAT 

Goals 

• Automatic and optimal extraction of image information 
for diverse scientific programs 

• Uniform object catalogs 

• Interactive catalog (and ultimately image) analysis 

Components 

• Image processing and reduction: 

pixels → catalogs 

• Object classification 

• Catalog matching and calibration: 

plate catalogs + CCD catalogs → survey catalogs 

• Catalog querying and analysis 

Present Implementation 

• Runs on Sparc II 

• FOCAS for image and initial catalog processing 

• for database maintenance 

The Data 


POSS-II 

• 3 bands: IIIa-J (blue), IIIa-F (red), and IV-N (near IR) 

• 6.5° square, with 5.0° spacing 

• Limiting magnitudes are typically: 

B_J ~ 22.5, R_F ~ 21.5, and I_N ~ 19.5 

• Average seeing ~ 2".5 - 3".0 

CCD Calibrations 

• 3 bands: Gunn g, r, and i 

• Limiting magnitudes 1 - 2 mag fainter than plates 

• For photometric calibration and training/test data for 
object classification 

Scans 

• Using STScI PDS 

• Scanning parameters: 

pixel (step) size: 15 µm 
aperture: SOfsm 

• ~ 2700 images of 23,040² pixels each 

• 2 byte digitization → 1 GB / plate → 3 TB total 

• Astrometric solution good to ~ 1" 
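
The data volume quoted above follows directly from the scan parameters. The short check below is only a back-of-envelope sketch; the constant and function names are illustrative and not part of SKICAT.

```python
# Back-of-envelope check of the scan volume quoted on the slide.
# All names here are illustrative; only the three numbers come from the text.
PIXELS_PER_SIDE = 23_040   # pixels along one side of a scanned plate
BYTES_PER_PIXEL = 2        # 2-byte digitization
N_IMAGES = 2_700           # approximate number of plate scans


def plate_bytes(side=PIXELS_PER_SIDE, depth=BYTES_PER_PIXEL):
    """Size of one digitized plate in bytes."""
    return side * side * depth


def survey_bytes(n_images=N_IMAGES):
    """Size of the whole set of scans in bytes."""
    return n_images * plate_bytes()


if __name__ == "__main__":
    print(f"per plate: {plate_bytes() / 1e9:.2f} GB")
    print(f"survey:    {survey_bytes() / 1e12:.2f} TB")
```

One plate comes to roughly 1 GB and the full set of about 2700 scans to just under 3 TB, in line with the figures on the slide.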



Current Catalog Generation Scheme 


Footprint 


Digitized Photo Plate 



Catalog Entries 


Image Processing Steps: 

1 . Scan (digitize) photo plate. 

2. Detection: detect contiguous pixels 
in image that are to be grouped as one 
object (standard image processing). 

3 . Perform more accurate local sky 
determination for each detected object. 

4. Evaluate parameters for each object 
independently: 40 base-level attributes. 

5. Split objects that are "blended" 
together and re-evaluate attributes. 

6. AutoPSF: get sure-thing stars, 

form template. 


7 . Measure resolution scale and resolution 
fraction of each object: 

These are obtained by fitting to template of 
sure-thing stars. 

8. Classify objects in image. 
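
Step 2 above (grouping contiguous pixels into one object) can be illustrated with a generic connected-component search. This is a minimal sketch under stated assumptions: the slide does not describe how FOCAS performs detection, so the fixed threshold, the 4-connectivity rule, and the function name `detect_objects` are all illustrative choices.

```python
from collections import deque


def detect_objects(image, threshold):
    """Group 4-connected pixels brighter than `threshold` into objects.

    `image` is a list of rows of pixel values. Returns a list of objects,
    each a list of (row, col) pixel coordinates. Generic sketch only; this
    is not the FOCAS detection algorithm.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill starting from this bright pixel.
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(pixels)
    return objects
```

In a real pipeline the threshold would be set relative to the local sky level (step 3), and blended objects would still need to be split afterwards (step 5).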




All steps are automated except for step 6. 

Software used for image processing: FOCAS 

When all seven steps are performed, we believe 
from experiments so far that machine learning 
techniques can attain more than 90% correctness 
in classifying all objects in the image (step 8). 

So two problems to work on : 

• Step 6: AUTOPSF 

• final Step (8): classify objects in image. 

A sample of base-level attributes derived in Step 4 
(re-evaluated in step 5): 

• isophotal magnitude 

• isophotal area 

• core magnitude 

• core luminosity 

• sky brightness 

• sky sigma (variance) 

• image moments (8): 

irl, ir2, ir4, 
rl, r2 

ixx, iyy, ixy 

• eccentricity (ellipticity) 

• orientation 

• semi-major axis 

• semi-minor axis 

attributes derived in step 7: 

• resolution fraction 

• resolution scale 
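
Several of the shape attributes listed above (semi-major and semi-minor axes, orientation, ellipticity) can be derived from the second-order image moments ixx, iyy, ixy. The sketch below uses the standard textbook relations for the principal axes of a 2x2 moment matrix; the slide does not say how FOCAS defines these quantities, so the formulas and the function name are assumptions.

```python
import math


def shape_from_moments(ixx, iyy, ixy):
    """Derive axis lengths, orientation and ellipticity from second moments.

    Uses the eigenvalues of [[ixx, ixy], [ixy, iyy]]. Returns
    (a, b, theta, ellipticity) with theta in radians measured from the
    x axis. Textbook convention, assumed rather than taken from FOCAS.
    """
    mean = 0.5 * (ixx + iyy)
    diff = math.hypot(0.5 * (ixx - iyy), ixy)
    a = math.sqrt(mean + diff)                # semi-major axis (rms width)
    b = math.sqrt(max(mean - diff, 0.0))      # semi-minor axis (rms width)
    theta = 0.5 * math.atan2(2.0 * ixy, ixx - iyy)
    ellipticity = 1.0 - b / a if a > 0 else 0.0
    return a, b, theta, ellipticity
```

For example, moments ixx = 4, iyy = 1, ixy = 0 describe an object elongated along x with axes 2 and 1 and ellipticity 0.5.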


Matching with Higher-Resolution Data 


Footprint 


Digitized Photo Plate 


CCD Image (high resolution ) 


Match Objects 


Matched Feature 
Vectors 


Classify objects by 
examining corresponding 
CCD image 


training example 


Have high resolution CCD frames for small 
areas of each plate: 

• Images differ only in low intensity 
(faint) objects 

• Majority of objects on plate are faint. 

Objects in CCD image can be classified by 
astronomer. However, many of the 
corresponding objects on film image are too 
faint to be processed by inspection. 


Use of CCD frames (other than photometry): 

• Obtain classified data for training learning 
algorithm. 

• Verify results obtained. 
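
The matching step that produces training examples can be sketched as a positional cross-match between the plate catalog and the CCD catalog. This is an illustrative sketch only: it assumes both catalogs have already been put on a common coordinate frame, uses a brute-force nearest-neighbour search, and the function names and tolerance are hypothetical.

```python
import math


def match_catalogs(plate, ccd, tol):
    """Pair each plate object with its nearest unused CCD object within `tol`.

    `plate` and `ccd` are lists of (x, y) positions in a common frame.
    Returns a list of (plate_index, ccd_index) pairs.
    """
    matches, used = [], set()
    for i, (px, py) in enumerate(plate):
        best, best_d = None, tol
        for j, (cx, cy) in enumerate(ccd):
            if j in used:
                continue
            d = math.hypot(px - cx, py - cy)
            if d <= best_d:
                best, best_d = j, d
        if best is not None:
            used.add(best)
            matches.append((i, best))
    return matches


def training_examples(plate_attrs, ccd_classes, matches):
    """Attach the astronomer's CCD class to the attributes measured on the plate."""
    return [(plate_attrs[i], ccd_classes[j]) for i, j in matches]
```

Each resulting pair combines attributes measured on the plate with a class assigned from the CCD frame, which is the form of training example used by the learning algorithms.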


Classification of faint objects 


[Diagram: digitized plate image → match object with corresponding CCD image → 
measure attributes → classify] 

Training example: 

( area, isophot. mag, ... )    [Class = sf] 
     attribute values              class 


Use machine learning to construct a mapping: 

Attributes measured on plate → Class 

[Figure: star and galaxy examples as seen on POSS-II Plate J442 and in CCD Field A1632] 

Results of classification using all attributes 
(including resolution scale & fraction): 


• used data from 3 plates in which objects 
were classified by astronomer after 
examining the corresponding CCD frames. 


Algorithm    Accuracy 

RULER        94.1% ±0.79 
O-BTree      91.2% ±1.1 
GID3*        90.1% ±1.3 
ID3          75.6% ±5.1 


Results are for ≈ 1,100 (J380+J442) train, test 
within and outside plates / normalized attributes. 
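
The decision-tree learners compared above choose splits by information gain. The sketch below shows that criterion for a single numeric attribute with a threshold split. It is a simplified illustration: classic ID3 works on discrete attributes, threshold splits are a common extension, and none of GID3*, O-BTree or RULER is reproduced here. Function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on `value <= threshold`."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    w = len(left) / len(labels)
    return entropy(labels) - w * entropy(left) - (1.0 - w) * entropy(right)


def best_threshold(values, labels):
    """Threshold with the highest information gain, or None if no split exists."""
    candidates = sorted(set(values))[:-1]
    return max(candidates, key=lambda t: information_gain(values, labels, t), default=None)
```

A tree builder would apply `best_threshold` to every attribute, split on the best one, and recurse on each branch.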

Neural Nets 


For overall classification/star selection 
subproblem, used: 

• traditional back propagation 

• conjugate gradient optimization 

• variable metric optimization 
learning algorithms. 

Performance comparable with decision trees, but 
not as stable. 

Accuracy 94.2% (best), 30% (worst) 
(common range is: 76% - 84%, NC) 
Avg. 81% ±5 

Star Selection: 97% (best) 
(common range: 78% - 92%) 

Major Problems: 

• Not interpretable 

• Expensive to train: 

- slow convergence 

- how to decide number of nodes? 

- initial weight settings? 


Why is accurate object classification important? 
An example of a scientific challenge: 

We expect that the Northern Sky Catalog will contain about 500 
quasars with z > 4. The problem is, how do we recognize them 
among the 200 million foreground stars? 

Multicolor selection may be able to bring the number of candidate 
objects down to about 10,000. However, follow-up spectroscopy 
with a return of ≲ 5% would be prohibitively expensive. 

Experience by the Cambridge (APM) group suggests that most 
interlopers would be misclassified galaxies. 

Thus, a better star-galaxy separation would make a search for 
high-redshift quasars much more effective! 
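
The numbers in this argument can be checked with a few lines of arithmetic. Only the 500 quasars and the 10,000 candidates come from the slide; the fraction of interlopers removed by better star-galaxy separation is a purely hypothetical input chosen for illustration.

```python
def spectroscopy_yield(n_targets, n_candidates):
    """Fraction of follow-up spectra that turn out to be real targets."""
    return n_targets / n_candidates


def candidates_after_cleanup(n_candidates, n_targets, interloper_fraction_removed):
    """Candidates left once a given fraction of interlopers is rejected.

    `interloper_fraction_removed` is a hypothetical parameter, not a
    figure from the presentation.
    """
    interlopers = n_candidates - n_targets
    return n_targets + interlopers * (1.0 - interloper_fraction_removed)


if __name__ == "__main__":
    print(spectroscopy_yield(500, 10_000))            # yield quoted on the slide
    remaining = candidates_after_cleanup(10_000, 500, 0.9)
    print(remaining, spectroscopy_yield(500, remaining))
```

With the slide's numbers the yield is 5%. If, say, 90% of the interlopers could be rejected as galaxies, about 1,450 candidates would remain and roughly one spectrum in three would be a quasar.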



[Fig. 1: two-color plot of stellar objects; the region expected for quasars with z > 4 is shown by the large ellipse] 

THE STATUS OF THE PROJECT: 

• STATPROG is in regular use for scientific applications; 
it produces good science and new results 

• Major parts of SKICAT completed and tested, 
documentation being completed and improved 

• Some general utilities developed: 

• Data visualization: PONGO, a general, public-domain 
graphics package, a blend of the two most popular 
(in astronomy) graphics packages, PGplot and Mongo. 

• X-windows based database query user interface 
(DB engine driver) 

• General catalog manipulation tools for SAS and SYBASE 
DB environments 

• Finding charts graphing tools 

• Initial scientific verification and testing of SKICAT started: 

• Plate scan quality control feedback to STScI 

• Faint galaxy counts and galaxy evolution at moderate 
redshifts (Weir, SGD) 

• Large-scale structure and galaxy correlations analysis 

(Brainerd, Weir, SGD) 

• Searches for high-redshift quasars (Smith, Weir, SGD) 

. . . and much more to come! 

GrADS Design Goals 



INTERACTIVE 

Full Control of Data and Display 
Sub-Second Response Time 
Customizability - Scripting Language 

EASY TO LEARN AND USE 


HARDCOPY VIA VECTOR GRAPHICS 

Interpreted Command Line Scripting Language 


Language Design: 

Programmability: as simple as possible 

Form GrADS commands via string manipulations and pass 
back to program for execution 

Return command results as script variables 


Language Elements: 

Variables of type character 
Arithmetic and logical operators 
Built-in and user-specified functions 
Flow control: loops, if/then/else 
Fully recursive 

Sample Usage: 

Automate commonly used command sequences 
Perform complex calculations 

Create new GrADS data files from results of GrADS 
calculations 

Interact with the graphics screen 
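
The design pattern described above (a script builds commands as strings, hands them to the host program, and receives results back as string variables) can be sketched in a few lines. The example is written in Python rather than the GrADS scripting language, and `ToyEngine` with its `set` and `query` commands is a hypothetical stand-in, not the GrADS command set.

```python
class ToyEngine:
    """Stand-in for the host program. NOT the GrADS command set."""

    def __init__(self):
        self.settings = {}

    def execute(self, command):
        """Run one command string and return its result as a string."""
        words = command.split()
        if not words:
            return "error: empty command"
        if words[0] == "set" and len(words) == 3:
            self.settings[words[1]] = words[2]
            return "ok"
        if words[0] == "query" and len(words) == 2:
            return self.settings.get(words[1], "undefined")
        return "error: unknown command"


def run_script(engine):
    """Script side: form commands by string manipulation, keep results as strings."""
    results = []
    for level in ("850", "500", "200"):
        engine.execute("set lev " + level)            # command built by concatenation
        results.append(engine.execute("query lev"))   # result comes back as a string
    return results
```

Keeping every value a character string, as the slide describes, means the script language needs no separate numeric or structured types to talk to the host program.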



GrADS DATA SETS 


FORMATS: GrADS INTERNAL -OR- GRIB -OR- DRS 
Binary 

Optimized for I/O Performance 

Support for arbitrary length file/record headers 


CREATE 

Fortran OR C 
Standard I/O Statement 


MODIFY 

Fortran or C or UNIX file commands 

Update In place 

Extend 


USE IN OTHER APPLICATIONS 
Fortran or C 


PORTABILITY 

All UNIX Computers (e.g., NFS) 
DOS-based personal computers 


OTHER FORMATS CAN BE SUPPORTED 
(packed binary, ASCII, netCDF, etc.) 
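
Because the data sets are plain binary files written with standard I/O statements, a grid can be produced and read back with very little code. The sketch below assumes a headerless stream of little-endian 32-bit floats in row-major order; that layout is an assumption made for illustration and is not a statement of the GrADS internal format. All function names are hypothetical.

```python
import os
import struct
import tempfile


def write_grid(path, grid):
    """Write a 2-D grid as a flat stream of little-endian 32-bit floats."""
    with open(path, "wb") as f:
        for row in grid:
            f.write(struct.pack(f"<{len(row)}f", *row))


def read_grid(path, nrows, ncols):
    """Read the grid back; the caller supplies the dimensions."""
    with open(path, "rb") as f:
        data = f.read()
    values = struct.unpack(f"<{nrows * ncols}f", data)
    return [list(values[r * ncols:(r + 1) * ncols]) for r in range(nrows)]


def roundtrip(grid):
    """Write a grid to a temporary file and read it back."""
    fd, path = tempfile.mkstemp(suffix=".dat")
    os.close(fd)
    try:
        write_grid(path, grid)
        return read_grid(path, len(grid), len(grid[0]))
    finally:
        os.remove(path)
```

Since the file carries no self-description in this sketch, the grid dimensions have to be supplied separately when reading.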






TECHNOLOGY TRANSFER - The GrADS Experience 

There are hundreds of GrADS users at over 30 institutions 
They use GrADS for a wide variety of scientific investigations 
They haven’t complained much 

WHY? 

GrADS is easy to use and learn by design 

GrADS allows scientists to access/manipulate/see their data 
without requiring additional programming 

It is easy for scientists to get their data into and out of GrADS 

We have been willing to work with user groups to add the 
functionality they request: we have paid attention to user 
feedback and eliminated their "show stoppers" 

We have packaged the software for distribution to make 
installation simple for a scientist ==> no system administrator 
assistance is required 

GrADS is platform independent - it runs on Unix workstations 
and PCs so scientists can do their work wherever it suits them 
to do it 

GrADS is distributed free of charge 


CURRENT GrADS USAGE 


Research: 

• Model output analysis 

  - Global atmospheric general circulation models 
  - Global ocean models 
  - Tropical models 
  - Coupled ocean-atmosphere models 

• Observational data analysis 

  - Station data (African rainfall, Asian soil moisture, etc.) 
  - Gridded objective analyses 


Education: 

• Interactive classroom use 
• Student research projects 
• Student self-education 


Forecasting: 

• Real time observational data analysis 


Public information: 

• Daily weather forecasts 

• Maryland state ozone maps 

• Seminars with interactive displays 

LL Michael Fiorino 

Data Set: Hourly Surface Airways station observations 

Script: Interactive selection of domain, time, variable; 

objectively analyses station data and displays 
maps or station time series 


Daily Navy Operational Global Atmospheric 
Prediction System (NOGAPS) analyses and 
forecasts with surface observations 

GrADS interface to NODDS - Navy 
Oceanographic Data Distribution System 

Surface temperature anomalies for January - June 
1992. 

Major New Features 


• GrADS Scripting Language (GSL) Enhancements 

- Global Variables 

- Compound Variables; Arrays 

• Batch Mode 

• Shell Escape 

• Support for GRIB Data Format 

• Multiple File Time Series 

• Additional Functions 

- Finite Differencing Functions 

- skip, const 

• User Defined Functions 

• Additional Graphics 

- Station Models 

- Label Formats (Axis and Contour Labels) 

- Enhancements to Colorized Vectors, Streamlines 

• General Coordinate Transformations 

• Data Editing 




SUMMARY 

CUSTOMIZABILITY 

• Scripting language added during past year permits users 
to customize virtually all aspects of GrADS functionality 

INTERACTIVE USAGE 

• Scripting language allows interaction with display for 
interactive scientific investigation using one’s own data 

USER COMMUNITY 

• GrADS is being used by a growing community of users 
nationally and internationally 

• User feedback has been extremely helpful in prioritizing 
development tasks and initiating new ideas 

PLANS 

• Other data format standards, GUIs for specific data sets 

FOR MORE INFORMATION 
Contact: 

BRIAN DOTY ... doty@cola.umd.edu 

A Distributed Analysis and Visualization System for 
Model and Observational Data 


The project is part of a larger effort called PATHFINDER 
(Probing ATmospHeric Flows in an INteractive and 
Distributed EnviRonment) at NCSA. It is designed to 
bring to the researcher a flexible, modular, collaborative, 
and distributed environment for use in studying 
atmospheric and fluid flows, and which can be tailored for 
specific scientific research and weather forecasting needs. 

The topics covered here include : 

• PATHFINDER Objectives 

• What We Have Accomplished 

• Deliverables and Time Tables 

• The Future 

• PATHFINDER Examples 

• NCSA Mosaic is used as the software for this talk. 

PATHFINDER Objectives: 


• To utilize existing diagnostic and analysis software 
capabilities such as those found in GEMPAK (the 
GEneral Meteorological PAcKage built at Goddard Space 
Flight Center) 

• To access a variety of analysis and display capabilities 
including three dimensional rendering, animation, and 
collaborative tools for interacting with remote users on 
different workstations across the national network 

• To couple multiple and heterogeneous computers (e.g. 
SGI VGX, Cray Y-MP, CM-2, CM-5) 

• To exploit the use of off-the-shelf software, both 
commercial and public domain: IRIS Explorer, Inventor, 
AVS, HDF, DTM 

• To manage large amounts (gigabytes and beyond) of 
data generated by satellites, observational field programs, 
and model simulations 

• To incorporate video recording and playback, high 
definition monitors, and a virtual reality viewer 

• To process multiple data streams (both model and 
observational) in creating a visualization 

Software Environment 


Explorer 

• is the primary development and execution 
environment 

• is highly interactive 

• supports distributed computing 

• is easily extensible 

• provides user interface tools 

• has a rich collection of existing modules 

• NCSA is working closely with Explorer development 
team 


AVS 

• is used to convert a subset of PATHFINDER 
modules on Convex and SGI machines 

• is highly interactive 

• supports distributed computing 

• is easily extensible 

• provides user interface tools 

• has a rich collection of existing modules 


HDF 

• is supported on a lot of platforms 

• is a multi-object file format 

• has Fortran and C calling interfaces for storing 
and retrieving 8- and 24-bit raster images, 
palettes, scientific data and annotations 

• is the standard archive format for EOSDIS 


Inventor 

• is an object-oriented 3-D toolkit 

• provides a library of objects that you can use, 
modify and subclass 

• simplifies application development 

• facilitates moving data between applications 
with 3-D Interchange File Format 


DTM 

• uses TCP/IP for transmission of data 

• is built on top of Berkeley sockets interface 

• has Fortran and C calling interface 

• is easy to use 


What We Have Accomplished 


• GEMVIS 

• Animation Tool 

• Metadata 

• Data Acquisition 

• Visual Idioms 

• Laser Disk Recording 

• Prototypes 

• Explorer Alpha and Beta Testing 


GEMVIS 

• A subset of PATHFINDER 

• A set of capabilities targeted to the GEMPAK user 
community: 

- GEMPAK grid diagnostics & file access: 

□ the full range of GEMPAK functions 
and operators supported 

□ a new operator implemented for 3D 
vector data 

□ all GEMPAK grid data formats 
supported 

□ all vertical coordinate systems handled 

□ relevant modules: GemGetGrid, GemGridFunction 

- map projections & topography: 

□ a wide variety of map projections 

□ maps registered with data set visualizations 

□ topography displayed in appropriate projections 

□ relevant modules: GemDrawMap, GemTopo, 
GemGetGrid, and GemGridMapping 

• Implemented in both IRIS Explorer and AVS to 
reach widest audience 

• Packaged for easy use by current GEMPAK users 

• To be released through Unidata & COSMIC 


Animation Tool 

• Composition of complex 3D animations 

• Uses keyframes to specify the animation of 
various scene parameters including: 

□ Camera position 

□ Object Movement 

□ Object Material 

□ Lighting/Environment 

• Can be linked to Explorer to animate Explorer 
output over: 

□ Time 

□ Analysis parameters such as isosurface threshold 

• Can produce raw image files for each frame or 
SGI/MPEG movies. 

• Is based on Explorer's Render module making 
it familiar to Explorer users. 


Metadata (new Grid data type) 

• Extends existing data formats to include critical 
descriptive data 

• Enables proper interpretation of data throughout 
the processing stream 

• Includes: 

- labels for data fields, dimensions, and units 

- geographic area and map projection data 

- missing data values 

- identification tags for geometries 

• Modules implemented: 

- readhdf2g, contour2g, isosurface2g, 
orthoslice2g, wireframeg, HorizGridSlice, 
VertGridSlice, SliceTime, GemDrawMap, 
GemGetGrid, GemTopo, GemGridFunction, 
GridAxisVertical, GridMapping, GridTitle, 
PrintGridMetadata. 


Data Acquisition 

• improved UI 

• multiple lattice outputs 

• full N-d support for input/output 

• automatic backward compatibility with older HDF 
libraries 

• support all 3 Explorer coordinate types 

• provide detailed information for file contents (e.g. 
calculate min, max) 

• relevant module: ReadHDF2 

HDF image/palette reader 

• relevant modules: Read_HDF_Palette, 
Read_HDF_Image 

DTM connection 

• get data from processes on remote machines 

• relevant modules: Read3DDTM 
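
The idea behind the Metadata grid type, descriptive data that travels with the values so later modules can interpret them correctly, can be sketched with a small data structure. The field names below are illustrative and are not taken from the PATHFINDER modules; the example also shows why the missing-data value matters when a reader reports a file's min and max.

```python
from dataclasses import dataclass


@dataclass
class GridMetadata:
    """Descriptive data carried with a grid (illustrative field names)."""
    field_label: str
    units: str
    dim_labels: tuple
    missing_value: float
    map_projection: str = "unspecified"


@dataclass
class Grid:
    values: list
    meta: GridMetadata

    def valid_range(self):
        """Min and max over valid points only, or None if all are missing."""
        good = [v for v in self.values if v != self.meta.missing_value]
        if not good:
            return None
        return min(good), max(good)
```

Without the missing-value tag, a flag such as -9999 would be reported as the data minimum and would distort any color map or contour range derived from it.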

Vector 

• different vector styles 

• different data to color mapping 

• different vector scaling 

• controls adjusted to input data range 

• relevant modules: VectorDisp 


Contour 

• capability to set contour interval in addition to # of 
contour levels 

• controls adjusted to input data range 

• show zero contour & dim negatives options 

• contour data range can be set fixed or data 
dependent 

• relevant modules: Contour!, Contour2g 
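
The two ways of specifying contours mentioned above, by interval or by number of levels, lead to different level sets for the same data range. The sketch below is a generic illustration; the function names are hypothetical and do not describe the internals of the contour modules.

```python
import math


def levels_from_interval(vmin, vmax, interval):
    """Contour levels at whole multiples of `interval` inside [vmin, vmax]."""
    first = math.ceil(vmin / interval)
    last = math.floor(vmax / interval)
    return [k * interval for k in range(first, last + 1)]


def levels_from_count(vmin, vmax, n_levels):
    """`n_levels` evenly spaced levels strictly inside (vmin, vmax)."""
    step = (vmax - vmin) / (n_levels + 1)
    return [vmin + step * (k + 1) for k in range(n_levels)]
```

Specifying an interval gives round level values that stay the same from frame to frame (and include a zero contour whenever zero lies in the range), whereas specifying a count adapts the levels to each data range.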



Annotation 

• annotated vertical axis with tick-marks and 
labels 

• geometric representation of a color map 

• 2D or 3D titles which incorporate metadata 

• relevant modules: GridAxisVertical, ColorKey, 
GridTitle 


Particle advection 

• generates points in a user specified subvolume 
of input data space 

• advects all the points in space for the duration 
of display interval 

• displays time dependent/independent particle 
trajectories 

• relevant modules: GenPoints, ParticleAdvector, 
ParticleTrace 
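
The particle advection idiom (seed points in a subvolume, then move them through the flow) can be sketched as follows. This is a minimal illustration, assuming a time-independent velocity field supplied as a function and simple forward-Euler stepping; the slides do not state which integration scheme the ParticleAdvector module uses, and the function names are hypothetical.

```python
import random


def gen_points(n, lo, hi, seed=0):
    """Seed `n` random points inside the box with corners `lo` and `hi`."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(a, b) for a, b in zip(lo, hi)) for _ in range(n)]


def advect(points, velocity, dt, n_steps):
    """Advance each point through `velocity` with forward-Euler steps.

    `velocity` maps a position tuple to a velocity tuple (steady flow
    assumed). Returns one trajectory (list of positions) per point.
    """
    trajectories = [[p] for p in points]
    for _ in range(n_steps):
        for traj in trajectories:
            p = traj[-1]
            v = velocity(p)
            traj.append(tuple(x + dt * vx for x, vx in zip(p, v)))
    return trajectories
```

For model output on a grid, `velocity` would interpolate the wind components to the particle position, and a time-dependent version would also take the current time as an argument.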

Laser Disk Recording 

• provides an interface to the Sony laser disk 
recorder 

• record user and title information on the disk 

• relevant module: Laser_Disc_Recorder 


Prototypes 

Frame flipbook 

• an image flipbook for storage and playback of 
images 

• a geometry flipbook for storage and playback of 
sequences of geometric scenes while retaining 3D 
interactive control of viewing parameters 

Simplified user interface to Explorer for basic 
visualization tasks 

• provides simple to use quick access to common 
visualization methods (contours, isosurfaces, etc.) 

• generates Explorer maps by clicking on a table of 
data variables vs visual idioms 

DIALOGUE 

• is an object oriented message passing system for peer 
style communications 

• uses DTM as the transport mechanism 

• supports synchronous and asynchronous 
communication 

• is used to move data between running models and 


Explorer Alpha and Beta Testing 

• Numerous bug reports to Explorer development 
team 

• Contributed to each Explorer release: 1.0, 2.0a, 
2.0b1, 2.0b2 and 2.0 

• Contributed to the concept of looping and animation 
in Explorer 

• Contributed to the development of the scripting 
language 

• Pushed Explorer to its limits 

- Large data sets 

- Virtual Reality 


Deliverables and Time Table 

September 1993: 

• beta release of NCSA Pathfinder Explorer modules to 
NCSA anonymous server 

- SGI binaries, standard module help files 

• release of GEMVIS to Unidata 

- module source, documentation, sample data 


October 1993: 

• beta release of Inventor animation tool to NCSA 
anonymous server 

- SGI binaries, help file 

• release of GEMVIS to COSMIC 

- module source, documentation, sample data 

Spring 1994: 

• Workshop on PATHFINDER 

• held at NCSA 

• other sites connected by video teleconferencing 


Summer 1994: 

• final release of NCSA Pathfinder Explorer modules to 
NCSA anonymous ftp server 

• updated SGI binaries, source code, user 
documentation, sample datasets, sample Explorer 
maps 

• Inventor animation tool source , user's guide, 
sample data files 



Getting Started with NCSA Mosaic 

Marc Andreessen 
Software Development Group 
National Center for Supercomputing Applications 
605 E. Springfield, Champaign IL 61820 
marca@ncsa.uiuc.edu 

May 8, 1993 


1 Introduction 

NCSA Mosaic is a distributed hypermedia system de- 
signed for information discovery and retrieval over the 
global Internet. Mosaic provides a unified interface 
to the various protocols, data formats, and informa- 
tion archives used on the Internet and enables pow- 
erful new methods for discovering, using, and sharing 
information. 1 

This document is a brief overview of the Mosaic envi- 
ronment. For more information on any subject covered 
in this document, please feel free to send an electronic 
mail message to: 

mosaic@ncsa.uiuc.edu 

Alternately, feel free to contact the author directly. 


2 The Pieces 

NCSA Mosaic uses a client/server model for informa- 
tion distribution - a server sits on a machine at an 
Internet site fulfilling queries sent by Mosaic clients, 
which may be located anywhere on the Internet. 

Units of information sent from servers to clients are 
simply termed documents. Documents may contain 
plain text, formatted text, inlined graphics, sound, and 
other multimedia data, and hyperlinks to other docu- 
ments that may be in turn located anywhere on the 
Internet. 

The NCSA Mosaic for the X Window System client
looks like Figure 1: each underlined word or phrase is a
hyperlink; clicking on it with a mouse button causes the
client to connect to the appropriate server anywhere on
the Internet and to retrieve and display the referenced
document.

1 Mosaic is based on the World Wide Web technology from
CERN in Switzerland and uses the World Wide Web common
client library for much of its low-level communications layer.



Figure 1: NCSA Mosaic for X - Home Page


When the Mosaic environment is used to serve and 
access a document containing formatted text and in- 
lined graphics on the Internet, the result may look some- 
thing like Figure 2. 
Figure 2: NCSA Mosaic for X - Inlined Image


3 Setting Up The Client 

Currently, NCSA has developed and released version 1.0
of the X Window System Mosaic client, which can be
used on almost any modern Unix-based graphics work-
station (e.g., Sun Sparc, IBM RS/6000, DEC 5000 or
Alpha, Silicon Graphics IRIS). The Macintosh client is 
under development and slated for 1.0 release in sum- 
mer ’93, and work on the Microsoft Windows client will 
begin in early summer ’93. The remainder of this doc- 
ument currently assumes use of the X client. 

The NCSA Mosaic anonymous FTP site is: 
ftp.ncsa.uiuc.edu

Binary executables of the X Window System client for 
several common Unix platforms are in this directory: 
/Mosaic/xmosaic-binaries

Source code for the X Window System client is in this 
directory: 

/Mosaic/xmosaic-source


3.1 Basic Client Setup 

Pulling down, installing, and running a Mosaic binary 
is straightforward: there are no configuration files or en- 
vironment variables that need to be set for the program 
to work. 

Note: If you are running on a Sun workstation under 
Open Windows, you may experience a large number of 
spurious warnings when you first execute the program; 
these are not serious and will not impair the proper 
functioning of the program, and the online Frequently 
Asked Questions list explains how to make them go 
away. 

Note: When Mosaic is executed, the first thing it 
does is pull the Mosaic “home page” (startup docu- 
ment) over the Internet from an NCSA server. If you 
get an error message instead of this home page, your 
connection to the Internet is less than complete. Please 
either run Mosaic on a system fully connected to the 
Internet or contact the author for more information. 

3.2 Complete Client Setup 

To allow interaction with a wide variety of data for- 
mats for images, audio, video, and typeset documents 
currently popular on the Internet, Mosaic relies on a 
number of external viewers: programs separate from
Mosaic that are invoked when necessary to display cer- 
tain types of data. 

Here is a list of recommended external viewers for use 
with Mosaic: 

• xv: A popular image display utility for X Window
System platforms; xv can display images encoded
in GIF, JPEG, TIFF, and several other formats.
xv can be obtained via anonymous FTP to:

export.lcs.mit.edu

The filename of xv on that server is currently:

/contrib/xv-3.00.tar.Z

The NCSA Mosaic Technical Summary, which de-
tails Mosaic's current capabilities and our plans for
further development in 1993, is in this directory:

/Mosaic/mosaic-papers

• showaudio: A shell script capable of transparently
playing several formats of audio on several plat-
forms; showaudio is available as part of the meta-
mail multimedia mail toolkit available at the fol-
lowing FTP site:

thumper.bellcore.com

The current filename for the metamail distribution
is:

/pub/nsb/mm2.4.tar.Z
The metamail program itself is used by Mosaic to
process multimedia documents that use the MIME
format. MIME is not yet widely used as a docu-
ment format, but it is becoming increasingly pop-
ular and is likely to become a de facto Internet
multimedia document standard in the near future.

• mpeg_play: A display utility for MPEG-format an-
imation sequences and video clips; mpeg_play is
available from the following FTP site:

postgres.berkeley.edu

The current filename is:

/pub/multimedia/mpeg/mpeg_play-2.0.tar.Z

External viewers useful for viewing typeset docu-
ments in TeX's DVI format and in PostScript are harder
to compile and install, but here's what we recommend:

• ghostview: A user interface to a PostScript viewer
called ghostscript; ghostview offers the best freely
available online PostScript viewing technology cur-
rently available. The FTP server is:

uxc.cso.uiuc.edu

You will need the following files:

/gnu/ghostscript-2.5.2.tar.Z
/gnu/ghostscript-fonts-2.5.2.tar.Z
/gnu/ghostview-1.4.1.tar.Z

• xdvi: An online viewer for documents in TeX's DVI
format; xdvi can be found at:

export.lcs.mit.edu

The filename is:

/contrib/xdvi.tar.Z
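
The viewer list above can be read as a lookup table from data type to external program. The following Python sketch illustrates that idea only: the table name, the MIME-style type strings used as keys, and both function names are assumptions made for illustration and are not taken from the Mosaic source.

```python
import shutil

# Hypothetical table: data type -> external viewer named in the list above.
# The type strings are illustrative keys, not Mosaic's actual configuration.
VIEWERS = {
    "image/gif": "xv",
    "image/jpeg": "xv",
    "image/tiff": "xv",
    "audio/basic": "showaudio",
    "video/mpeg": "mpeg_play",
    "application/postscript": "ghostview",
    "application/x-dvi": "xdvi",
}


def viewer_command(content_type, path):
    """Return the argument list that would launch a viewer, or None if unknown."""
    program = VIEWERS.get(content_type.lower())
    if program is None:
        return None
    return [program, path]


def is_installed(program):
    """Report whether the viewer is on PATH; viewers are installed separately."""
    return shutil.which(program) is not None
```

A caller would check `is_installed` before launching the command, since each viewer is a separate program that must be fetched and built on its own.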


These are the basic external viewers that Mosaic will
assume are present when it encounters multimedia data
of the appropriate types. There are other such viewers;
more information is available in the online documenta-
tion.


4 Setting Up The Server

You can serve documents to NCSA Mosaic sessions
running anywhere on the Internet from an ordinary
anonymous FTP server. However, for performance rea-
sons, we recommend setting up a HyperText Transfer
Protocol (HTTP) server 2.

NCSA has developed a freely available HTTP server.
It is currently in late beta and is available at:

ftp.ncsa.uiuc.edu

The directory is:

/Mosaic/ncsa_httpd

Full configuration instructions are provided with each
of the binary and source distributions available there.
Feel free to contact the author if you have any questions
or problems.

5 Where To Go From Here

Assuming you have successfully installed the Mosaic
client on your system, you should spend some time to
become familiar with the interface it presents to the
universe of information already available on the Inter-
net. The Mosaic "demo document", available via a
hyperlink in the Mosaic home page, is a self-contained
overview of Mosaic's capabilities with hyperlinks to a
wide variety of interesting information sources.

5.1 Serving Information

If you wish to serve information to the Internet, you
should first learn about HyperText Markup Language
(HTML), the SGML-based markup language used for
formatted hypermedia documents in Mosaic.

You should also learn about the Uniform Resource
Locator (URL 3) scheme for consistently naming docu-
ments accessible on the Internet.

At the very end of the Mosaic demo document (men-
tioned above), there are hyperlinks to primers on HTML
and URL's. Those primers in turn point to more ad-
vanced information sources.


2 HTTP is a stateless, lightweight, and extremely fast alterna-
tive to FTP for network file transfers in the World Wide Web
environment.

3 The "U" in URL previously stood for Universal.
- Earth based telescopic images

- Data from terrestrial meteorological sai 
JPL LinkWinds

The Linked Windows Interactive Data System 
DATAHUB: SCIENCE DATA MANAGEMENT IN SUPPORT
OF INTERACTIVE EXPLORATORY ANALYSIS 

Thomas H. Handley, Jr. 

Mark R. Rubin 
Jet Propulsion Laboratory 
Pasadena, California 91109 


Abstract 

The DataHub addresses four areas of significant need: 
scientific visualization and analysis; science data 
management; interactions in a distributed, 
heterogeneous environment; and knowledge-based 
assistance for these functions. The fundamental 
innovation embedded within the DataHub is the 
integration of three technologies, viz. knowledge-based 
expert systems, science visualization, and science data 
management. This integration is based on a concept 
called the DataHub. With the DataHub concept, science 
investigators are able to apply a more complete solution 
to all nodes of a distributed system. Both computational 
nodes and interactive nodes are able to effectively and 
efficiently use the data services (access, retrieval, update, 
etc.) in a distributed, interdisciplinary information 
system in a uniform and standard way. This allows the 
science investigators to concentrate on their scientific 
endeavors, rather than to involve themselves in the 
intricate technical details of the systems and tools 
required to accomplish their work. Thus, science 
investigators need not be programmers. The emphasis is 
on the definition and prototyping of system elements 
with sufficient detail to enable data analysis and 
interpretation leading to information. The DataHub 
includes all the required end-to-end components and 
interfaces to demonstrate the complete concept. 


* Technical Group Supervisor 
Member of AIAA 
** Member of Technical Staff
Copyright © 1993 by the American Institute of 
Aeronautics and Astronautics, Inc. No copyright is
asserted in the United States under Title 17, U.S. Code. 
The U.S. Government has a royalty-free license to 
exercise all rights under the copyright claimed herein for 
government purposes. All other rights are reserved by 
the copyright owner. 


Setting the Stage - The Issues 

It is difficult, if not impossible, to apply existing tools for 
visualization and analysis to archived science instrument 
data [2]. This difficulty is generally the result of (1)
incompatible data formats and the lack of available data 
filters; (2) the lack of true integration between the 
visualization and analysis tools and the data archive 
system(s); (3) incompatible and/or non-existent 
metadata; and (4) the exposure of the scientist to the 
complexities of networking. These problems will be 
multiplied by the avalanche of data from future NASA 
missions [8, 32]. New modes of research and new tools
are required to handle the massive amount of diverse 
data that are to be stored, organized, accessed, 
distributed, visualized, and analyzed in this decade [4, 
26]. 

The areas of most immediate need are: (1) science data 
management; (2) scientific visualization and analysis, (3) 
interactions in a distributed, heterogeneous 
environment; and (4) knowledge-based assistance for 
these functions. The fundamental innovation required is 
the integration of three automation technologies: viz. 
knowledge-based expert systems, science visualization, 
and science data management. This integration is based 
on a concept called the DataHub. 

With the DataHub, investigators are able to apply a 
complete solution to all nodes of a distributed system. 
Both computational nodes and interactive nodes are able 
to effectively and efficiently use the data services (access, 
retrieval, update, etc.) in a distributed, inter-disciplinary 
information system in a uniform and standard way. 
This enables the investigators to concentrate on their 
scientific endeavors, rather than to involve themselves in 
the intricate technical details of the systems and tools 
required to accomplish their work; thus, investigators 
need not be programmers. 

DataHub addresses data-driven analysis, data
transformations among formats, data semantics
preservation and derivation, and capture of
analysis-related knowledge about the data. Expert
systems will provide intelligent assistant system(s) with
some knowledge of data management and analysis built
in. Eventually DataHub will incorporate mature expert
system technology to aid exploratory data analysis, e.g.,
neural nets or classification systems. Additionally, as a
long term goal, DataHub will be capable of capturing
and encoding knowledge about the data and their
associated processes. The DataHub provides data
management services to exploratory data analysis
applications, e.g., the LinkWinds [23] and PolyPaint+ [15]
exploratory data analysis environments.

In developing DataHub we use the problems posed
by the science co-investigators to direct
capability and development decisions. DataHub's
general problem-solving structure will be applied to the
general science problems described by the science co-
investigators.


Goals and Objectives 

Our goal is to integrate the results from science data 
management, visualization, and knowledge-based 
assistants into a scientific environment; to demonstrate 
this environment using real-world NASA scientific 
problems; and to transfer the results to science 
investigators in the appropriate disciplines. 

The specific objectives of the DataHub work are to: 

1. Define and develop an integrated system that is
responsive to the science co-investigator's needs. 

2. Demonstrate the interim capabilities to the 
participating science users of the system in order to 
receive their suggestions. 

3. Transfer the results of this effort to a broad base of 
science investigators as appropriate. 

4. Provide a system that will enable the science
investigator to obtain publishable scientific 
information. 


Emerging Relationships 

As illustrated in Figure 1, LinkWinds is providing two 
functions: (1) a visual data exploration or analysis 
environment; and (2) visual browsing and subsetting 
services. In the first function, LinkWinds will be notified 
via a message of the presence of data. The existence of 
this data will be incorporated into the LinkWinds 
database menu and, hence, be made available to the user 
immediately. The second function will be used when it 
is more convenient to graphically select the subsetting 
attributes. After selection of the attributes, a message 
will be sent to DataHub, the filtering accomplished, and 
the results re-submitted to LinkWinds for analysis. 

A new link is being established with PolyPaint+.
PolyPaint+ will provide interactive visualization of
complex data structures within three-dimensional data
fields, in addition to visual subsetting services.
Interactions with PolyPaint+ will require DataHub to
expand its understanding of formats and data, and to
provide different filtering capabilities.

Machine learning techniques are being applied to
feature recognition in datasets of interest at JPL. The
specific problem is to detect and categorize small
volcanoes on Venus using the Magellan SAR data. The
techniques in user interaction for feature selection and
machine learning will be directly applied to the pre-
processing tools used in the DataHub environment.

The Navigation Ancillary Information Facility provides 
a capability called SPICE (Spacecraft, Planet, 
Instrument, C-matrix, and Events)[19]. SPICE contains 
all the ancillary data associated with a mission. The data,
along with an extensive library, are available for
an expanding set of missions. The SPICE capability,
initially developed to support science analysis, is now 
available as a toolkit. It is our intention to investigate 
the use of the SPICE toolkit in association with other 
applications to provide needed ancillary data and 
processing. 


Approach

We have analyzed the management of distributed data 
across different computing and display resources. 
Subsequent to this analysis and design, we implemented 
the specific components required to provide needed 
science functions. Several prototypes have been 
provided to illustrate the capabilities. Additionally, we 
have attempted to apply knowledge-based expert 
system and machine learning technologies to provide 
"assistants" for the science investigator in data discovery
and selection, tool selection, and science processing.
Today's solution, DataHub, takes the first steps toward 
the integrated solution needed to provide the means to 
satisfy the technology and science requirements in the 
1990s by providing a high performance, interactive 
science workstation with the capabilities to handle both 
exploratory data analysis and science data management. 


The Basic Concept 

Figure 1 depicts the current functional architecture for 
the DataHub. The major functions of the DataHub 
include providing (1) an interactive user interface; (2) a 
command-based query interface; (3) a set of data 
manipulation methods; (4) a metadata manager; and (5) 
an underlying science data model. The interactive user 
interface, basic data operators and a data interchange 
interface with LinkWinds have been implemented in the
initial prototypes. 

The command-based query interface, such as the one with
LinkWinds illustrated by the double-headed arrow in
Figure 1, lets the data visualization system
issue data management commands to the DataHub. The
data manipulation methods provide the selection, 
subsetting, conversion, transformation, and updates for 
science data. The metadata manager captures the 
necessary knowledge about science data. Finally, the 
science data model supports the underlying 
object-oriented representation and access methods. 

Figure 2 depicts the current software architecture. A 
layered architecture has been adopted for the 
implementation, which implies that any layer can be 
changed and/or replaced without affecting other layers. 
The top layer is the external interface that links to the 
human users via an interactive interface provided by 
DataHub or the visualization system via a connection 
interface. The data model is implemented in the 
intelligent data management layer. The data interface 
layer provides the physical data access functions. 


Current Capabilities 

DataHub Version 0.5 has been implemented and tested 
in the Sun SPARCstation and the Silicon Graphics 
environments. The implementation uses the software 
structures illustrated in Figure 2. 

From a user's standpoint, DataHub
recognizes/understands several common datasets either
by name or format, plus several other popular formats.
The datasets include MCSST, CZCS, Voyager, Magellan, 
AVIRIS, Viking, and AirSar; the formats include VICAR 
[17], DSP, HDF [24], netCDF [20, 27], and CDF [3]. 
Present preprocessing capabilities are data filters, e.g., 
temporal or band selections, subsampling and averaging 
options, and spatial subsetting. With the data link with 
LinkWinds, the user may select and process a dataset of 
interest then proceed to the LinkWinds environment for 
exploratory data analysis. 
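
The preprocessing filters named above (spatial subsetting, subsampling, and averaging) can be illustrated on a small two-dimensional grid. This is a minimal Python sketch under stated assumptions: the function names, the half-open index ranges, and the choice to drop partial edge blocks when averaging are illustrative and do not describe DataHub's actual implementation.

```python
def spatial_subset(grid, row_range, col_range):
    """Keep rows r0..r1-1 and columns c0..c1-1 (half-open ranges)."""
    r0, r1 = row_range
    c0, c1 = col_range
    return [row[c0:c1] for row in grid[r0:r1]]


def subsample(grid, factor):
    """Keep every factor-th sample along both axes."""
    return [row[::factor] for row in grid[::factor]]


def block_average(grid, factor):
    """Average non-overlapping factor x factor blocks; partial edge blocks are dropped."""
    if not grid:
        return []
    rows = len(grid) - len(grid) % factor
    cols = len(grid[0]) - len(grid[0]) % factor
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


grid = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
```

Subsampling discards samples while averaging combines them, so the two options trade data volume against noise differently; a band or temporal selection would be the same slicing applied along a third axis.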

The current DataHub user interface and a typical user 
session including interactions with LinkWinds are 
illustrated in Figures 3 and 4, respectively. A description
of the interface design update and development may be
found in [12].
Figure 2 -- Software Architecture


Our initial experience with knowledge-based or machine 
learning technology was based on work accomplished 
using artificial neural nets. This work was spurred by
our science co-investigators' need to model regions of
the ocean for which the visible and infrared imagery is
obscured by clouds, and thus to extrapolate biological
and physical variables from cloud-free regions in space
and time to the cloud-obscured regions. This produced
acceptable science products but required too much 
technical expertise to translate into a generic tool. As 
described above, new machine learning techniques are 
being investigated to provide feature recognition 
capabilities with a more user-friendly interface. 


Recent Developments
Context Sensitive Help 

The DataHub user interface is intended to be self- 
explanatory and intuitively usable with little or no 
instruction. In the area of user interfaces, however, 
intent and reality often diverge. 

In packaging DataHub for distribution to a user site 
outside the development environment, it was obvious 
that a traditional "README" file was needed to detail 
installation instructions. It was also clear that although 
the DataHub user interface had largely succeeded in 


achieving its goal of intuitive usability, there remained a 
need for a small amount of instruction to get the first- 
time user started. While writing a short (< 10 
paragraphs) explanatory document, it became obvious 
that this text could be integrated into the main help 
system that had been designed into DataHub. 

A benefit of using the X Windows resource manager to 
control an application's user interface is the ease with 
which all aspects of the interface can be customized. 
Textual material can be modified as simply as more 
traditional customizable user interface elements such as 
colors and layout. Because of this, any instructional text 
that might otherwise be included in a separate help 
document (either hard-copy or on-line) can be easily 
integrated as a dynamic part of the application itself,
eliminating the problems of help being unavailable or
hard to find when needed.

At the same time, a full-blown hypertext system is
neither needed nor appropriate for DataHub. Help for
DataHub falls into two categories: initial, new-user help,
and context-based help for particular DataHub
capabilities. The former can be satisfied by a fairly large
(as dynamic, on-screen help texts go) set of instructions, 
and the latter by small explanations easily accessible 
while the user is performing, or contemplating 
performing, a DataHub operation. In particular, the
navigation of a help system is replaced by the navigation
of the DataHub user interface itself, with single-level
help available at each node of the interface.

Figure 3 -- Current DataHub Interface

Figure 4 -- Typical User Interaction

Multiple individual help buttons fit naturally in many
parts of the DataHub user interface. A help pulldown
menu was added to the section of the main DataHub
window devoted to generic DataHub control issues. It is
in this menu that an item for popping up the
introductory text was placed. Additionally, all normal
DataHub popup windows have help buttons that pop up
text dialogs containing help on their particular subject.

More difficult was deciding how to access help for 
graphical user interface elements (i.e. for the interface 
element's operations) in cases where the interface was a 
single button or menu with no place for a separate help 
button. Pulldown menus can have an additional help 
item added; simple pushbuttons cannot. 

A context help mechanism was implemented for the case
of pushbuttons (see Figure 5). The user selects "Context
Help ..." from the main help pulldown menu. DataHub 
acknowledges this input by changing the mouse cursor 
to a question mark ("?") shape. 

The user can then move the question mark cursor to any 
element of the DataHub interface, and release it to see a 
help dialog about that element. The underlying code 
sends a message requesting help to the object 
representing the graphical element, which in turn 
displays its textual help. 

This method handles any and all kinds of graphic 
elements, regardless of their screen real-estate 
limitations. In fact, in the case where an element has a 
dedicated help button, the context-help method also 
works, invoking the same message and displaying the 
same help dialog. 

Additionally, help hierarchies are a natural by-product 
of this implementation. Dropping the question mark 
cursor onto a graphical element gets help on that subject. 
Dropping it into the area surrounding the element gets 
more generic help on the type of interaction the element 
is a part of. For example, selecting "Subsampling Factor" 
or "Averaging Factor" displays help on their respective 
topics, but selecting physically between the two displays 
help on the subject of subsetting data in general. 

The help system can grow and evolve using this 
framework. If the user drops the question mark cursor 
onto a graphical element that does not have a help 
message defined, the message automatically propagates 
to the ancestor of the element, repeating this process if 
necessary until it finds one that does have a defined help 
method. In this way, the user can get help (although
more general) even when specific help is yet to be
implemented.
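
The propagation rule described above, in which a help request climbs to the nearest ancestor that defines help, can be sketched in a few lines of Python. The class name, attribute names, and help strings below are hypothetical; DataHub itself is built on X/Motif, and this sketch only illustrates the lookup logic, not its code.

```python
class Element:
    """A user-interface element with optional help text and an optional parent."""

    def __init__(self, name, parent=None, help_text=None):
        self.name = name
        self.parent = parent
        self.help_text = help_text

    def request_help(self):
        """Return this element's help, or the nearest ancestor's if none is defined."""
        node = self
        while node is not None:
            if node.help_text is not None:
                return node.help_text
            node = node.parent
        return None  # no element in the chain defines help


# Illustrative hierarchy: window -> subsetting panel -> two controls.
window = Element("main window", help_text="General DataHub help.")
subset_panel = Element("subsetting panel", parent=window,
                       help_text="Help on subsetting data in general.")
subsampling = Element("Subsampling Factor", parent=subset_panel,
                      help_text="Help on the subsampling factor.")
averaging = Element("Averaging Factor", parent=subset_panel)  # no specific help yet
```

Here `averaging.request_help()` falls back to the panel's general text, which mirrors the behavior described for elements whose specific help has not been written yet.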



Portability 

Since the goal is to provide an extensible system capable
of evolving to provide solutions to broader science and
engineering domains, portability is a significant issue.
Initially, we conceived using a combination of C,
PROLOG, and Common Lisp for the implementation.
Today, portability and minimizing the cost to the user are
being addressed by using common platforms (viz. Sun
SPARCstations, Silicon Graphics) and portable and
public domain tools (viz. C/C++, FORTRAN, X/MOTIF,
CLIPS, UNIX and SQL database interface).


netCDF Data Format 

The data format Network Common Data Form (netCDF)
was developed by the Unidata Program, sponsored by
the Division of Atmospheric Sciences of the National
Science Foundation. The emerging standard is
distributed as an I/O library which stores and retrieves
scientific data structures in self-describing, machine-
independent files. DataHub now recognizes this format.

The current implementation supports:

• recognition of netCDF as a file type.

• a set of rules for conversion of netCDF to and from
HDF format.

This new capability has been included to facilitate the 
use of netCDF data in LinkWinds and HDF data in the 
PolyPaint+ environments. 
At this time, netCDF can be seen as providing richer
structures. This is supported by the breadth of metadata
annotations available as native functions. We found
translation from HDF to netCDF more straightforward
than the reverse.


What needs to be done? 

From the design point-of-view, we have defined a 
general framework for science data management, and 
identified a critical subset of data operators for the 
science data applications. From an implementation 
perspective, we have developed prototypes that enable 
validation of basic concepts of data resource sharing 
between the data suppliers and data consumers (e.g. a 
data visualization system such as LinkWinds). 

Based on the object-oriented design of DataHub, it is 
straightforward to extend the data model to capture the
definitions of an existing relational data system. For 
example, the comprehensive data catalog built by the 
Planetary Data System (PDS) will become part of 
DataHub’s data model with specialized data access 
methods defined to access the existing information in 
PDS via a standard SQL interface. This approach makes 
discipline-oriented knowledge readily available to 
DataHub. Additionally, expanded knowledge about 
data formats and data semantics in various science 
disciplines will be built into DataHub. It is a goal that 
the understanding of the visualization and analysis tools 
will also become part of DataHub such that special data 
operators will be built automatically using basic known 
operators. Assessing the quality of science
data after data transformation will be a research area for
DataHub, and will be addressed in the next steps.

We will enhance the existing prototype to provide access 
to additional data sets while expanding the capabilities 
for direct support to the science co-investigator. 
Particularly, the issues associated with processing multi- 
spectral data will be addressed. We will be enhancing 
the preprocessing capabilities by accessing and utilizing 
the NAIF SPICE ancillary data as they become available.

Besides continuing to evolve to a more object-oriented 
implementation, several issues will be addressed. When 
data transformation or conversions take place, we need 
to assure the preservation of data validity or quality 
measures. We need to treat the data quality assessment 
issues such as (1) treatment of missing data and (2) data 
quality associated with data interpolation, data 
transformation, etc. 
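
One way to picture the two quality issues listed above is an averaging step that skips missing samples and reports how much valid data went into the result. The Python sketch below is an assumption-laden illustration: the use of `None` as the missing-value marker, the function name, and the "valid fraction" quality measure are all hypothetical and are not taken from DataHub.

```python
MISSING = None  # assumed marker for a missing sample


def average_with_quality(values):
    """Return (mean of valid samples, fraction of samples that were valid)."""
    valid = [v for v in values if v is not MISSING]
    if not valid:
        # Nothing to average: the result is itself missing, with zero quality.
        return MISSING, 0.0
    return sum(valid) / len(valid), len(valid) / len(values)
```

Carrying the valid fraction alongside each derived value lets a later step decide whether the value is trustworthy, rather than silently treating an average built from few samples like one built from many.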

Expanded knowledge about the data is of significant 
importance. This includes knowledge of data formats 
(e.g. usage of metadata embedded in the data set
headers), data semantics (e.g. meaning of data values,
relationships between data sets, discipline-dependent 
data access/analysis methods) and data semantics as 
represented by the users' context in the visualization 
regime (e.g. what are the links, dataflows, etc. as 
encapsulated in the LinkWinds environment). The 
ability to detect and understand this expanded 
knowledge will be incorporated into the label- 
understanding expert system. 

Additional understanding of the analytical tools 
required for data selection, data transformation and data 
conversion in order to support the visualization 
requirements is needed. These may be thought of as
filtering tools to select and prepare data for use in the 
visualization environment. These additional tools will 
be defined and implemented. 

In those cases where selection criteria are so complex 
that they are most easily exercised visually, it is clear 
that a close integration of the database management
system and the data visualization system is
advantageous. Such integration will be studied by 
closely tying DataHub with LinkWinds so that DataHub 
will be accessible from LinkWinds and LinkWinds will 
be accessible from DataHub, each being used to best 
advantage in the data management processes. 

Finally, we will address the issues associated with data 
presentation. In particular, data exchange protocols that 
facilitate visualization are to be addressed first. 

Major Components 

DataHub will be enhanced to include these capabilities: 

• interactions to support finding, selecting and 
processing multi-spectral datasets (initially 
AVIRIS). 

• band aggregations 

• band filters (e.g., removal of artifacts of the 
instrument) 

• 3D subsetting/averaging 

• journal and transaction management with playback
capability.

• expanded data model that includes user-defined 
data conversions. 

• canonical set of data objects and methods 

• self-describing data objects and methods 

• user defined defaults for spatial regions, temporal
periods, etc.

• incorporate the metadata into the interfaces with 
LinkWinds and PolyPaint+. 

• expanded rule-based capability to understand 
foreign datasets, leading to a capability for 
interpretative conversions and transformations. 

• expanded data dictionary for use in label 
recognition, plus the ability to dynamically add 
new object attributes once their semantics are
clearly understood. 

• initial usage of calibration and registration data 

• quality measures, to include 

• processing lineage 

• null and missing value recognition and usage in 
processing 

• incorporation of content-based applications such as
the machine learning capability described above

• expanded interactions with LinkWinds and 
PolyPaint+ 

• distribution of DataHub processing and interactions
and remote services.

Using the DataHub, scientists will request data for
presentation and analysis in a specific way for use in
their applications, without being particularly concerned
with the original location and format of data being
utilized. Applications adhering to the DataHub
protocols and interfaces may interoperate, sharing results
through the DataHub.

As described previously, LinkWinds will be enhanced to 
have two-way communications with DataHub. Besides 
receiving the user’s selected data for analysis, 
LinkWinds will provide graphical subsetting and 
transformation parameters and send a processing 
request for DataHub to execute and return the desired 
data. 

PolyPaint+ will have an interface similar to that of LinkWinds.
After this communications and processing link has been
implemented, DataHub will be enhanced to provide
more specialized processing for the PolyPaint+
community (that is to say, netCDF, supercomputing, and
modeling).

Machine Learning and Feature Detection 

It is difficult for a scientist to examine and understand
data with a large number of dimensions. Scientific
visualization tools are one means for performing
necessary transformations and dimensionality reduction
to allow a scientist to "see" meaningful patterns in the
data. However, these require that the scientist specify
the necessary steps. Faced with multi-spectral
remote-sensing data arriving over more than 200 channels,
expecting a scientist to study the entire data set becomes
unreasonable. This often results in using only some of
the data channels or using the data in very limited ways.
An automated tool for aiding the analysis of such
high-dimensional data sets would enable scientists to get at
more of the information contained in the data.

We will use machine learning and pattern recognition 
techniques to aid in the analysis of multispectral data. 
Consider a scientist interested in characterizing certain
regions in the data, for example, locating the areas on
earth where certain minerals are present, or where some 
phenomenon of interest occurred. By selecting portions 
of the data of interest and others that do not contain 
phenomena of interest, a scientist is essentially pointing 
out examples (instances) of the desired target. These can 
be treated as training data, and used by learning 
algorithms to automatically formulate classifiers that can 
detect other occurrences of the target pattern in a large 
data set. Furthermore, since the learning algorithms are 
capable of examining a large number of dimensions at 
once, they may be able to find patterns that would be too 
difficult for a scientist to derive by manual analysis. In a
sense, this offers the option for a "logical" versus a
"visual" visualization of the patterns in the data. That is,
the algorithms produce a characterization of subsets of 
interest in the data in terms of logical expressions 
involving multiple input variables (channels). Often, it 
is possible to express such patterns in terms of compact 
rules involving an unexpectedly small number of 
variables. For example, channels 104 and 202 being in 
certain ranges may be highly predictive of a 
phenomenon that the scientists could not easily 
characterize using the first six channels. 

The use of learning algorithms thus provides flexibility 
in terms of adapting to a wide variety of detection 
problems. Our decision tree based learning algorithms 
produce rules that are easily examined and understood 
by humans. This contrasts with a statistical regression or 
neural network based approach, where the resulting 
forms are difficult to interpret. 
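
As a minimal sketch of how selected examples can yield a readable rule, the fragment below learns a single-channel threshold rule (a one-level decision tree, or "stump") from a handful of labeled pixels. It is an illustration only: the function name learn_stump, the data layout, and the toy values are assumptions, and the decision tree algorithms used in DataHub are not specified here.

```python
# Hypothetical sketch: learn a one-channel threshold rule from labeled pixels.
# The data layout and all names are illustrative assumptions.

def learn_stump(samples, labels):
    """Return (channel, threshold, accuracy) of the best rule
    'value in channel > threshold  =>  target'."""
    n_channels = len(samples[0])
    best = (None, None, -1.0)
    for ch in range(n_channels):
        values = sorted({s[ch] for s in samples})
        # Candidate thresholds lie midway between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            correct = sum((s[ch] > thr) == lab for s, lab in zip(samples, labels))
            acc = correct / len(samples)
            if acc > best[2]:
                best = (ch, thr, acc)
    return best

# Four toy "pixels" with three channels each; only channel 2 separates the classes.
pixels = [(0.2, 0.9, 0.1), (0.8, 0.1, 0.2), (0.3, 0.8, 0.7), (0.7, 0.2, 0.9)]
is_target = [False, False, True, True]
channel, threshold, accuracy = learn_stump(pixels, is_target)
print(f"IF channel[{channel}] > {threshold:.2f} THEN target (accuracy {accuracy:.0%})")
```

The result is a rule a scientist can read directly, which is the property contrasted above with regression or neural network outputs. A full decision tree would apply the same search recursively to each side of the split.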

Distributed Blackboard System 

The blackboard model allows for a flexible architecture 
with diverse knowledge sources cooperating to 
formulate a solution opportunistically [16]. A
distributed blackboard system running across multiple 
workstations can allow multiple scientists in different 
physical locations to work together on a single problem 
cooperatively. 
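
The opportunistic cooperation of knowledge sources can be sketched in a few lines. The example below is a single-process toy, not the distributed implementation: the Blackboard class, the two "expert" functions, and the file name are all hypothetical and chosen only to show that a knowledge source contributes whenever the facts it needs appear on the blackboard, regardless of registration order.

```python
# Illustrative blackboard sketch; names and toy knowledge sources are
# assumptions, not part of DataHub.

class Blackboard:
    def __init__(self):
        self.facts = {}
        self.sources = []

    def register(self, source):
        self.sources.append(source)

    def run(self):
        """Let knowledge sources contribute until none can add anything new."""
        progress = True
        while progress:
            progress = False
            for source in self.sources:
                for key, value in source(self.facts).items():
                    if key not in self.facts:
                        self.facts[key] = value
                        progress = True
        return self.facts

def format_expert(facts):
    # Contributes once a file name has been posted.
    if "file" in facts:
        return {"format": "netCDF" if facts["file"].endswith(".nc") else "unknown"}
    return {}

def reader_expert(facts):
    # Can act only after some other source has identified the format.
    if facts.get("format") == "netCDF":
        return {"reader": "netcdf_reader"}
    return {}

board = Blackboard()
board.register(reader_expert)   # registered first, but must wait for the format
board.register(format_expert)
board.facts["file"] = "sst_1982_01.nc"
result = board.run()
print(result)
```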

The DataHub metadata manager has been ported to a
distributed environment across multiple Sun
SPARCstations [22]. This distributed environment is the
underlying layer of an ongoing distributed blackboard
implementation. It is expected that the DataHub system
can sit on top of this blackboard system to function as
groupware for multiple scientists from multiple science
disciplines.

With this capability, DataHub can distribute the data 
access and data conversion load across multiple 
computers. At the same time, multiple users can access 
multiple data sources via this distributed scheme of 
DataHub. 

With the distributed blackboard, DataHub can have
multiple data servers with metadata (i.e., discipline
knowledge) about multiple data sources sitting across
the network. Each data server acts as an independent
knowledge source in the blackboard system. The
DataHub data servers use a consistent data access
mechanism provided by DataHub. The scientists use a
consistent DataHub user interface even though they
are running the DataHub data client on their own
workstations, geographically separated from one
another.

The inter-disciplinary knowledge about data can be 
stored in higher level knowledge sources (i.e., agents) in 
the blackboard system. Whenever a scientist has a need 
of a dataset that is outside a single discipline, this inter- 
disciplinary knowledge source is utilized to provide 
intelligent data access capability to access the right data 
from the right source. 

The distributed blackboard is implemented using a
reliable distributed computing protocol provided by
Cornell’s ISIS [1, 7]. ISIS version 2.1 is in the public
domain. Its concept of process groups in a distributed
environment, with a guarantee on the arrival
sequence of messages from multiple senders, fits
the need of the blackboard implementation.


Development and Deliverables 

We have planned three steps in the next phase of 
DataHub prototyping: 

Step 1. 

• Design and develop DataHub processing of multi-spectral
data sets for the science co-investigator.

• Initiate the distribution of DataHub processing and
provide general remote services.

• Design and develop interfaces to PolyPaint+. Collect
functional requirements from the user community.

• Design and develop the machine learning interface.

• Demonstrations will use the data sets as determined
by the science co-investigator.

Step 2. 

• Provide data abstraction and knowledge engineering 
to support applications in the LinkWinds and 
PolyPaint+ environments. 

• Demonstrations will use the data sets as determined
by the PolyPaint+ user community.

Step 3. 

• Provide the knowledge engineering required to 
utilize the computing environment and its tools. 
Incorporate this knowledge into the DataHub. 

• Provide support within the DataHub of all the
required datasets (homogeneous/regular and
heterogeneous).

• Demonstrations will use the data sets as determined
previously.

The development cycle used to solve the problems
addressed above will be to: define/expand the science
co-investigator’s problem; design, implement, integrate,
test, demonstrate, evaluate, and transfer to the science
co-investigator; and then iterate these steps. In each
cycle these areas will be addressed: (1) the DataHub; (2)
knowledge-based assistance for the DataHub; (3)
machine learning for feature recognition; (4) a problem
posed by a science co-investigator; and (5) the
LinkWinds/PolyPaint+ interface and protocol.

An incremental development methodology will be
utilized: "do-a-little, test-a-little".

Throughout the implementation effort, the science co- 
investigator and other scientists will participate in the 
design. This feedback and evaluation is important in 
providing a product that contributes to the scientists' 
ability to accomplish their science objectives. The 
success of the proposed work will be measured by the 
science utility of the work products. 


Benefits and Expected Results 

The principal product of the proposed work is the
demonstration of an integrated environment in which a 
science co-investigator will be able to accomplish data 
analysis and interpretation leading to publishable 
scientific information. Thus, DataHub is addressing 
broad aspects of: 

1. Providing innovative ways to facilitate the scientific 
endeavor or "mean-time to discovery" [33] when 
working with large volumes of data. The 
traditional computing data life-cycle is typically a 
sequential process. This traditional view provides 
sequential support to what is actually a highly- 
interactive, iterative process. DataHub will provide 
a data life-cycle as illustrated in Figure 3. 

2. Providing access to remote data, local data filtering 
and management and interactive exploratory data 
analysis. 

3. Applying knowledge-based expert systems and 
machine learning at the original data selection, in 
intermediate data filtering and in rule-based 
applications. 

The DataHub will provide an end-to-end solution to 
problems of this generic type, thus enabling science 
investigators to produce higher-level products through 
an analysis environment which provides an integration 
of required functions. This environment consists of:

1. An interface between the scientific visualization
and analysis environment and the data required to 
perform the analysis. 

2. Expert system / knowledge engineering-based 
analysis assistants and machine learning techniques 
to do: 

- data discovery and data selection 

- feature and image understanding preprocessing

- visualization and analysis tool selection 

3. The LinkWinds and PolyPaint+ environments and 
their analysis tools as the visualization mechanism 
and user interface environment. 

The benefits to NASA deriving from the DataHub 
include: 

1. Ability to analyze massive volumes of data in a cost- 
effective manner. 

2. Freedom for the NASA mission scientists to do the 
interpretative, creative aspects of science work. 

3. An advanced prototype for science support. 

4. Availability of common system modules and data 
formats for other developers. 



Figure 3 - Knowledge-Based Visualization 
and Analysis 

The primary productivity of phytoplankton in the ocean is largely responsible for the assimilation
of carbon into the oceanic environment, and thus in part the removal of carbon from the 
atmosphere. Because the ocean is thought to be a primary sink for atmospheric carbon, the basin 
wide and global distribution of oceanic primary productivity is of central importance in the global 
budget of carbon. To understand the global productivity of the oceans, the interactions between 
the physical and biological structures must be known. The biological population of the ocean 
is highly variable both spatially and temporally on all time and space scales. The global nature 
of this problem then requires the use of satellite instrumentation as the only platform capable of 
providing coverage on temporal and spatial scales that are appropriate to the assessment of
carbon flux in the ocean. The goal of this research is to increase our understanding of the 
sources of variability in the sea to provide a more accurate assessment of oceanic productivity 
from ocean color imagery. The objectives of this research are the description of the spatial and 
temporal distributions and variability of the planktonic community in the sea and primary 
productivity of that community. To achieve these objectives, remotely sensed data of the spatial 
and temporal distributions of pigment concentration and sea-surface temperature are required to 
provide a global description of the seasonal variability of the water column primary productivity. 


To address the broader context of the primary productivity of the sea, the physical and biological 
processes and their variability, including changes in water mass, incident irradiance, nutrients, and the
consequent formation of blooms of different species of marine phytoplankton and bacteria, must
be studied. In this investigation, we will use time series of the pigment distributions, taken from
the Coastal Zone Color Scanner (CZCS), and of the sea-surface temperature, taken from the 
NOAA Advanced Very High Resolution Radiometer (AVHRR). These time series will be 
examined to determine the spatial and temporal statistics of productivity, including the 
interannual variations that occur in productivity caused by variations in the physical environment. 


For this task we have chosen to use monthly composite global maps created from the satellite 
imagery. The pigment maps are created from the ratios of upwelling radiance at 440, 520 and 
550 nm, and have been composited from a data set that is characterized by a sparse data coverage 
because of the presence of clouds and because of the sampling characteristics dictated by the 
Nimbus-7 satellite operations. The monthly composite images from the CZCS contain significant 
regions for which no data exist. Attempts to estimate the global primary productivity of the 
ocean from these composite images have yielded a preliminary assessment of the net annual flux 
of carbon from the atmosphere into the oceans to be 3.2 G-tons Carbon per year, based on 
estimates of the water leaving radiance, and a regression against carbon flux from the work by 
Mitchell, et al. 1992. To provide a better estimate, and to provide the time series of this flux, 
we must interpolate the pigment images to provide an estimate of the pigment concentration in 
regions for which the data are inadequate. Conventional techniques such as bi-linear interpolation
and spline fits have given unsatisfactory results because of the large areas of missing data.

The MCSST data product can be used to understand variations in the sea-surface temperature in 
regions where large data gaps are present because this product has used an interpolated data field 
to supply missing data values for regions for which clouds obscure the surface during the time
of the monthly composite. To provide internal consistency between the pigment and sea-surface 
temperature data fields, the same interpolation scheme will be used for both data fields. 

The primary productivity of the sea has been shown to be related to both the standing stock of 
the phytoplankton population, and to the temperature of the sea, which both regulates the 
metabolism of the planktonic organisms and reflects the nutrient status of the sea through a 
physical relationship between temperature and nutrients in newly upwelled water. For these 
reasons, working with Mitchell, et al., we have developed a relationship between the temperature 
of the sea, the upwelled radiance ratio in the photosynthetic bands, and the primary productivity 
of the sea. While this relationship is still under investigation, the regressions that have been 
produced indicate that a strong correlation exists for the flux of carbon through the surface layer, 
and these relationships will be used to produce the first time series maps of the global flux of 
carbon in the sea. 

Through the primary productivity, the standing stock of phytoplankton, as reflected by the 
pigment concentration, is also related to the temperature, although in a very complicated manner. 
It is this relationship that we have exploited in the interpolation of the pigment fields. Figures 
1 and 2 indicate two latitudinal regions in the ocean, one at 15 degrees and one at 35 degrees 
north. In these figures, we illustrate two facts: First that at each latitudinal band, there is a 
strong correlation between temperature and pigment concentration. Second, that the correlation 
is very different for these two regions. The global picture for the correlation between
temperature and pigment concentration is shown in Figure 3. These results indicate that we may 
use temperature in the interpolation of the pigment fields, but that the algorithm is both regional 
and seasonal in nature, leading to an exhaustive computational problem if conventional analysis 
were to be applied. These facts have led us to investigate different methods for the interpolation 
of the pigment fields in regions for which sufficient data is not available to provide a satisfactory 
estimate of the pigment to permit an estimate of the productivity. 

The first method that we have examined is the use of a least squares regression using both 
temperature and pigment for the estimate. This technique will find a matrix transformation 
mapping spatial averages of temperature and pigment data and latitude values onto the space of 
pigment values such that the difference between the two sets is minimal in the root mean square 
sense. The variation of input parameters can be extended to include the square or cube of the 
spatial variables. These variables were combined in an equation where the coefficients were 
determined by the least squares technique. This analysis was conducted using the IMSL 
(International Mathematical Statistical Library) software. 

Several polynomials with different variable combinations were used to examine the variability 
of error produced with each equation. The coefficients of these polynomials describe the 
contribution of each variable to the predicted pigment value. The results of studies conducted 
on a restricted data set indicate that a simple linear regression based on the pigment alone gives 
a satisfactory fit to the data from the trial cases. The results suggest further testing on a 
significantly larger data set, using multiple iterations of the interpolation process. 
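
To make the regression step concrete, the following sketch fits a linear combination of temperature, pigment, and latitude to a pigment value by solving the normal equations. It is an assumption-laden illustration rather than the IMSL-based analysis: the helper names solve and least_squares are hypothetical, the data are synthetic, and the targets are generated from a known linear rule so that the recovered coefficients can be checked.

```python
# Minimal least-squares sketch in pure Python (synthetic data, hypothetical names).

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def least_squares(rows, targets):
    """Fit targets ~ w0 + w1*x1 + ... via the normal equations (X^T X) w = X^T y."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic predictors: (mean temperature, mean neighbouring pigment, latitude).
rows = [(20.0, 0.10, 15.0), (18.0, 0.40, 40.0), (15.0, 0.20, 25.0),
        (12.0, 0.50, 10.0), (22.0, 0.30, 35.0), (16.0, 0.15, 20.0)]
true_weights = [0.5, -0.02, 0.8, 0.001]
targets = [true_weights[0] + true_weights[1] * t + true_weights[2] * p
           + true_weights[3] * lat for t, p, lat in rows]
weights = least_squares(rows, targets)
print([round(w, 4) for w in weights])
```

Squared or cubed terms, as mentioned above, would simply be added as extra columns of the design matrix.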

The second method uses a neural net, coupled with a bi-linear interpolation, using
both the temperature and pigment fields to form the estimate of the missing data in the pigment
field, and the temperature field alone for the estimate of the missing temperature values. This
technique relies on the past and future pigment fields for the pixel in question and the past and
future temperature fields, coupled with a spatial interpolation of the pigment field, to produce an
estimate for the missing data pixel. The neural-net system is trained on a data set which has both
the spatial and temporal coverage appropriate to the data set under investigation, and is used on
the global data set for all time. The initial data set is shown in Figure 4, which illustrates the
large areas of missing data in the global pigment images. The results of the interpolation are
shown in Figure 5, which illustrates the degree to which the fields may be interpolated using this
technique. The technique has been verified by removing data from the original data set, applying
the technique to regenerate the data, and comparing the original data to that replaced by the
artificial intelligence system. Figure 6 shows the correlation between the pigment
concentration predicted by the neural net and the pigment concentration from the satellite measurements.
The correlation coefficient for this estimate is R^2 = 0.952.
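
The verification procedure (withhold known pixels, regenerate them, compare with the originals) can be sketched as follows. This is a toy under stated assumptions: the field is one-dimensional, the values are made up, and neighbour_mean is a stand-in predictor, not the neural net. The score is computed as the squared Pearson correlation between withheld and regenerated values.

```python
# Hold-out verification sketch; all names and values are illustrative.

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

def holdout_check(field, hidden, predict):
    """Mask the pixels in `hidden`, re-estimate them, and score the estimates."""
    masked = {k: v for k, v in field.items() if k not in hidden}
    truth = [field[k] for k in hidden]
    estimate = [predict(masked, k) for k in hidden]
    return r_squared(truth, estimate)

def neighbour_mean(masked, index):
    # Stand-in predictor: average of the adjacent pixels that are still present.
    near = [masked[i] for i in (index - 1, index + 1) if i in masked]
    return sum(near) / len(near)

# Toy 1-D "pigment field" indexed by pixel position.
field = {0: 0.10, 1: 0.14, 2: 0.20, 3: 0.31, 4: 0.38, 5: 0.52, 6: 0.55}
score = holdout_check(field, hidden=[1, 3, 5], predict=neighbour_mean)
print(f"R^2 on withheld pixels: {score:.3f}")
```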

For this task we used the best-known neural-net classifier (known as "back-propagation").
When back-propagation was originally introduced (Rumelhart, 1986), it was proposed that the criterion
function be optimized using gradient descent. However, it was soon realized that more efficient
algorithms and training techniques can be employed; the "momentum" term (Rumelhart, 1986)
is the most popular example of such improvements. This technique combines the gradient-descent
step with the previous weight change to update the weight vector.
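
The momentum update can be written compactly. The sketch below applies it to a toy quadratic criterion instead of a network error surface; the function names, learning rate, and momentum value are assumptions chosen for illustration, and in back-propagation the gradient would come from the backward pass through the network.

```python
# Gradient descent with a momentum term on a toy criterion (illustrative values).

def train_momentum(grad, w0, rate, momentum, steps):
    """Update rule: delta(t) = -rate * grad(w) + momentum * delta(t-1)."""
    w = list(w0)
    delta = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        delta = [-rate * gi + momentum * di for gi, di in zip(g, delta)]
        w = [wi + di for wi, di in zip(w, delta)]
    return w

# Toy criterion E(w) = (w[0] - 3)^2 + 10 * (w[1] + 1)^2 and its gradient.
def toy_gradient(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

w_final = train_momentum(toy_gradient, [0.0, 0.0], rate=0.02, momentum=0.8, steps=500)
print([round(v, 4) for v in w_final])
```

Setting momentum to zero recovers plain gradient descent, which makes the two schemes easy to compare on the same criterion.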

We used seven different variables as input to the neural-net program, with one output. The inputs
consist of three from the CZCS pigment field, three from the AVHRR temperature field, and the
latitude of the center pixel. Figure 7 shows the layout of the input parameters for the back-propagation
neural net.

The relationship between the global chlorophyll data and the temperature product is not well 
defined. Neural-nets have shown the ability to handle multi-dimensional data sets with non-linear 
relationships. 

From these experiments, we conclude that the neural net permits the computation of a globally 
interpolated data field for all time. The results of this study are being evaluated to determine the 
scientific validity of both techniques. 

Figure 6. [Axis label: Predicted Pigment Concentration (mg/m^3) from Neural-net]

Figure 7. Input parameters used to train the neural net. C represents
chlorophyll and T represents temperature. Each rectangle represents
the average pixel for each time slice. Latitude represents the latitude of
the image center.

[Figure title: Interpolated CZCS Pigment (mg/m^3), January 1982]

Abstract

A distributed version of the Clips language, 
dClips, was implemented on top of two existing 
generic distributed messaging systems to show that: 
(1) it is easy to create a coarse-grained parallel 
programming environment out of an existing language 
if a high level messaging system is used, (2) the 
computing model of a parallel programming 
environment can be changed easily if we change the 
underlying messaging system. dClips processes were 
first connected with a simple master-slave model. A 
client-server model with intercommunicating agents 
was later implemented. The concept of service broker 
is being investigated. 

Introduction 

In the process of exploring the opportunities 
of utilizing multiple workstations on a network as a 
single parallel computing environment, we have built 
a simple distributed Clips environment, named 
dClips, running with multiple parallel Clips 
processes on a Sun network. Clips, C language 
integrated production system [Clips91], is a forward- 
chaining rule-based language with object definition 
capability. Clips was developed by NASA Johnson 
Space Center. 


dClips was implemented on top of two
existing generic distributed messaging systems to 
show that: (1) it is easy to create a coarse-grained 
parallel programming environment out of an existing 
language if a high level messaging system is used as 
the underlying layer, (2) the computing model of a 
parallel programming environment can be changed 
easily if we change the underlying messaging system. 
In this paper, we describe two versions of dClips 
implementation on top of two different messaging 
systems. One messaging system supports only the 
master-slave model while the other supports a much 
more flexible communication scheme. 

A Master-Slave Model for Task Assignment 

dClips was first implemented on top of AERO 
[Sullivan89], the Asynchronously Executed Remote 
Operations from UC Berkeley. AERO allows 
parallel programming in a master-slave mode on a 
UNIX network. Communication is only allowed 
between the master and slaves, but not in between 
slaves. The master process can assign tasks 
asynchronously, but it has to block and wait for the 
result to come back. 

Figure 1. Master-Slave Model for dClips

As depicted in Figure 1, a single dClips
master process controls multiple dClips slave
processes. The master process first asks all the slave
processes to load the necessary Clips constructs (i.e., 
rules, objects, and functions) from the file system into 
their runtime environments. The master then assigns 
tasks to slave dClips processes by one of the following 
three methods: 

i). Assert a fact into the Clips knowledge base. This
request from the master process is executed on all the 
slave processes simultaneously. If a slave process is 
busy, it first finishes its current task then asserts the 
fact. By asserting facts into the working memory of 
slave processes, the master process could change the 
inferencing process in the slaves. 

ii). Call a Clips function. Any built-in Clips
function or user-defined function in a slave process
can be called from the dClips master process. This is
a form of remote procedure call in the context of the Clips
language. Also, the state of the working memory of a
dClips slave process can be examined by the master
by issuing a Clips function call. This allows the master
to decide if further task assignment is necessary.

iii). Send a message to a Clips object. A Clips object
message can be sent from the dClips master process to 
dClips slave processes. An active object instance 
within the slave process can receive messages and 
process the messages based on the behaviors defined 
in a message handler. 

This version of the dClips implementation is 
done in C using Clips 5.1. Four function calls are 
available between the dClips master and slave 
processes: loadClipsConstruct, assertClipsFact, 
callClipsFunction, and sendClipsMessage. The 
loadClipsConstruct primitive can take a list of
construct-files (i.e., files with Clips rules, objects,
and functions) and process them based on the sequence
of the list elements. The sequence in the list is the
sequence of execution in the loading process. For
example, the list (function.clp object.clp rule.clp)
will cause a slave process to load function.clp first,
object.clp next, and rule.clp last.

The assertClipsFact primitive takes a string 
with a single Clips fact and requests every slave to 
assert it into the knowledge base. The 
callClipsFunction primitive takes a string with a 
single function name and the function parameters, and 
sends it to all the slave processes. 

The sendClipsMessage primitive is also 
capable of passing a list of messages from master to 
slave. Each message is itself a list in the form: 
(class-name instance-name method-name method 
args). The slave that receives a sendClipsMessage 
request processes the messages based on the sequence 
of the list elements. This allows multiple class 
methods to be defined and executed in sequence as a 
single work assignment. 
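
The behavior of the four primitives can be illustrated with a small mock. The sketch below is not the dClips C implementation and does not call Clips or AERO: MockSlave, MockMaster, the snake_case method names, and the toy "(facts)" query are hypothetical stand-ins that only mimic the ordering rules described above (files loaded in list order, facts asserted on every slave, messages processed in sequence).

```python
# Mock of the four dClips primitives; no real Clips engine or network is used.

class MockSlave:
    def __init__(self, name):
        self.name = name
        self.loaded = []      # construct files, in load order
        self.facts = []       # asserted facts
        self.log = []         # handled object messages

    def load_clips_construct(self, files):
        for f in files:                     # list order is load order
            self.loaded.append(f)

    def assert_clips_fact(self, fact):
        self.facts.append(fact)

    def call_clips_function(self, call):
        # This mock understands only a toy query returning the asserted facts.
        return list(self.facts) if call == "(facts)" else None

    def send_clips_message(self, messages):
        for cls, instance, method, *args in messages:   # processed in sequence
            self.log.append((cls, instance, method, tuple(args)))

class MockMaster:
    def __init__(self, slaves):
        self.slaves = slaves

    def broadcast(self, primitive, argument):
        """Apply one primitive to every slave and collect the results."""
        return {s.name: getattr(s, primitive)(argument) for s in self.slaves}

slaves = [MockSlave("ocean"), MockSlave("viking")]
master = MockMaster(slaves)
master.broadcast("load_clips_construct", ["function.clp", "object.clp", "rule.clp"])
master.broadcast("assert_clips_fact", "(request convert image-42)")
state = master.broadcast("call_clips_function", "(facts)")
master.broadcast("send_clips_message",
                 [("IMAGE", "img42", "convert", "byte-swap"),
                  ("IMAGE", "img42", "write", "out.dat")])
print(state)
```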

An Application 

A Clips-based image data access application,
DataHub [Handley92], has been ported to the
dClips/AERO environment. The master process issues
concurrent image data access/conversion requests for
different dataset types. Each dataset type has a
different data format and data semantics. The
knowledge about the image datasets is stored in Clips
constructs, and loaded by the slave processes at
startup time. One dataset type is handled by one
slave process. There is no interaction needed among
slave processes. Data conversion tasks are both CPU
intensive due to format changes (e.g., byte swap, data
decompression) and I/O intensive due to massive reads
and writes of files.



Figure 2. DataHub Data Manager on top of dClips
[Diagram boxes: dClips Slave (Ocean Data Access), dClips Slave (Viking Data Access),
dClips Slave (Voyager Data Access), dClips Slave (Magellan Data Access)]

The DataHub data manager with the same
master-slave task assignment scheme has also been
ported to the dClips/ISIS environment (see next
section for details). The master-slave model is retained
in the ISIS implementation because of the nature
of the application, not because of any limitation of the
ISIS computing model.

A Client-Server Model with Service Brokers 

The master-slave process model imposed by 
AERO is not desirable if we want to build systems 
with communicating intelligent agents. After 
evaluating Sun's ToolTalk™ [ToolTalk91] and 
Cornell's ISIS [ISIS90] for an alternative distributed 
computing model, we decided to build dClips on top of 
ISIS. ToolTalk was not chosen because: 1) the
message arrival sequence from multiple senders in a
network environment is not guaranteed, and 2) only one
handler is allowed for a request message (others are
observers). Message arrival sequence is important
because the arrival sequence of messages for asserting 
a fact or for updating object instances in the dClips 
environment is critical to the local Clips inferencing 
process. Different message arrival patterns could 
result in different inference outcomes. Furthermore, 
the constraint of having a single handler for a 
request message makes it unnatural for task 
distribution/assignment in a parallel programming 
environment. 

On the other hand, ISIS, developed at 
Cornell University, provides a set of tools built 
around virtually synchronous process groups and 
reliable group multicast [Birman91]. A virtually
synchronous distributed system has the following 
characteristics: (1) all processes observe events in the 
same order (global order and causality), (2) an event 
notification is delivered to all or none of the audience 
(atomicity). A virtually synchronous system looks 
synchronous to every process in the system, but 
executes asynchronously. For the dClips
implementation, the virtually synchronous broadcast
(cbcast, for causal broadcast), which guarantees
causality and atomicity, was the main reason for
using ISIS as the underlying distributed computing
model.

Figure 3 shows the architecture of ISIS-based 
dClips, where a set of dClips Server processes team 
up with a dClips Administrator process to form a 
process group. This process group provides the 
cooperative problem solving capability to the outside 
world. The dClips Administrator plays the role of a
service broker, providing a consistent interface to the
outside clients, while the details of the server
processes are transparent to the clients. The interface
between a service broker and its clients has yet to be
defined. At this point, a CORBA (Common Object
Request Broker Architecture) IDL (interface
definition language) type interface is being
considered [OMG91]. The interface between the
dClips Administrator and the dClips Servers is a
shared knowledge base with a set of common access
methods.

The dClips Servers form a conceptual 
hierarchy, which is known to the dClips world, but is 
not visible to the ISIS environment. In other words, 
this Server hierarchy is not a hierarchy of ISIS 
process groups. In the ISIS environment, all the 
servers are equal members of a single process group. 
Broadcasts to the group will reach every server 
process in the same order. Each server is an 
autonomous problem solving agent with its own 
knowledge base and its own task. The Server
hierarchy defined within the dClips environment helps
a server to find another potential problem solver if a
problem cannot be solved locally.

A shared knowledge base is available to 
dClips Servers for knowledge exchange and 
interaction, which is designed to facilitate the 
cooperative problem solving process conducted by 
multiple dClips Servers. At the same time, each 
dClips server can have its own individual non-shared 
knowledge base. The Server hierarchy is defined as 
a Clips Class Hierarchy within the shared 
knowledge base, which is known to every server. The 
message communication between servers can be: (1) a
broadcast to the whole process group, or (2) a message 
to a designated server. 

The shared knowledge base is realized by having a 
set of Clips constructs replicated in each server. Each 
server loads in this shared knowledge base at 
initialization time. Any update to any of the objects 
in this shared knowledge base in any dClips Server 
will trigger a broadcast of the update to other 
members in the process group. A server applies the 
updates sent in from other servers one by one as if 
they are local updates to the knowledge base. Since 
the shared knowledge is designed to keep only the 
critical knowledge that needs to be shared among 
servers, the size of this shared knowledge base 
should be small. The effort to keep it consistent 
across multiple servers, i.e., sending and receiving 
update messages and applying updates triggered by 
remote update messages, should be minimal. 
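
Why a common delivery order keeps the replicas consistent can be shown with a short sketch. The code below is an in-process illustration under a strong simplifying assumption: a single ProcessGroup object hands every update to every replica in one agreed order, standing in for the ordering guarantee. It is not how ISIS is implemented, and all class and slot names are hypothetical.

```python
# Replicated shared knowledge base kept consistent by ordered broadcast (toy).

class Server:
    def __init__(self, name):
        self.name = name
        self.shared = {}          # this server's replica of the shared knowledge base

    def apply(self, update):
        slot, value = update
        self.shared[slot] = value

class ProcessGroup:
    def __init__(self, servers):
        self.servers = servers
        self.sequence = []        # the single agreed delivery order

    def broadcast(self, sender, update):
        # Every member, including the sender, applies updates in the same order.
        self.sequence.append((sender, update))
        for server in self.servers:
            server.apply(update)

servers = [Server("a"), Server("b"), Server("c")]
group = ProcessGroup(servers)
group.broadcast("a", ("dataset-status", "requested"))
group.broadcast("b", ("dataset-status", "converted"))
group.broadcast("c", ("owner", "b"))

# A replica that received the two conflicting updates in the opposite order
# would end in a different state, which is what ordered delivery prevents.
out_of_order = Server("d")
for update in [("dataset-status", "converted"), ("dataset-status", "requested")]:
    out_of_order.apply(update)
print(servers[0].shared, out_of_order.shared)
```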

Future Opportunities

Based on the client-server model of dClips, we would 
like to pursue the following extensions: 

i). dClips with Database Access Capability — This 
involves a dClips database gateway, which runs as a 
database client to some database server. An 
intelligent agent cannot be intelligent without the 
necessary knowledge about the real world. Accessing 
existing databases is one way of acquiring 
data/knowledge from the outside world. As shown in 
Figure 4, a database pass-through process can serve as 
the gateway to the database server. The function of 
this gateway can be as simple as passing a SQL 
statement to a relational database system and 
receiving the results back in a buffer. Or it can 
provide more sophisticated functions such as 
allowing joins of tables across multiple database 
systems. 
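
The simplest form of the gateway, passing one SQL statement through and returning the results in a buffer, might look like the following sketch. It assumes Python's built-in `sqlite3` module as a stand-in for the relational database server; the `pass_through` function and the `sites` table are hypothetical and not part of dClips.

```python
import sqlite3


def pass_through(connection, sql, params=()):
    """Forward one SQL statement and return all result rows as a list (the buffer)."""
    cursor = connection.execute(sql, params)
    return cursor.fetchall()


# Illustrative in-memory database standing in for the database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (name TEXT, altitude_m REAL)")
conn.executemany(
    "INSERT INTO sites VALUES (?, ?)",
    [("site_a", 1600.0), ("site_b", 50.0)],
)
buffer = pass_through(
    conn,
    "SELECT name FROM sites WHERE altitude_m > ? ORDER BY name",
    (1000,),
)
conn.close()
```

A more sophisticated gateway, such as one that joins tables across several database systems, would need to collect rows from each system and combine them itself; that is beyond this sketch.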

ii). A distributed blackboard system on top of dClips 
— The ISIS-based dClips implementation can easily 
evolve into a distributed blackboard system. This can 
be done by making dClips server processes run as 
knowledge sources in a blackboard system and by 
using the shared knowledge base among dClips 
servers as the blackboard [Nii86]. A blackboard 
system like this is a realization of the original 
blackboard metaphor because there is no 
centralized control mechanism involved in the 
blackboard reasoning process. Each knowledge 
source reacts only to the change on the blackboard. 
A domain problem can be solved cooperatively this 
way by multiple knowledge sources. 
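
The reactive behavior described above can be illustrated with a small single-process sketch. All names here (`Blackboard`, `doubler`, `reporter`) are hypothetical, and the notification loop is only a stand-in for the broadcast of shared knowledge base updates; in the proposed system each knowledge source would be a separate dClips server.

```python
class Blackboard:
    """Shared store; knowledge sources are notified of every change."""

    def __init__(self):
        self.facts = {}
        self.sources = []

    def register(self, source):
        self.sources.append(source)

    def post(self, key, value):
        if self.facts.get(key) == value:
            return  # nothing changed, so nothing to react to
        self.facts[key] = value
        for source in self.sources:
            source(self, key, value)


def doubler(board, key, value):
    # Reacts only to changes of 'x' and contributes 'double_x'.
    if key == "x":
        board.post("double_x", 2 * value)


def reporter(board, key, value):
    # Reacts only to changes of 'double_x' and contributes 'answer'.
    if key == "double_x":
        board.post("answer", value + 1)


board = Blackboard()
board.register(doubler)
board.register(reporter)
board.post("x", 20)
```

Neither knowledge source calls the other; each responds only to what appears on the blackboard, and the final entry is the result of their cooperation.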
Applications of 
Data Compression 

1. Data Storage 

2. Data Communication 

[Figure: communication channel to a receiver] 

3. Machine Learning 


lossless = decompressed data is identical to the original 

lossy = decompressed data may be an approximation to the original 
Key types of data: 

• text 

• computer source/object code 

• data bases 

• numerical data 

• speech 

• music 

• gray-scale images 

• color images 

• graphics 

• CAD data 

• animation 

• half-tone/ fax data 

• finger print images 

• bank check images 

• map and terrain data 

• medical imagery 

• scientific and instrument data, space data 

• image sequences 

• video 






Examples of Speeds Required 
for Real Time Processing: 

Text sent over a modem ~ 2,400 bits per second 
(Depending on the cost of the modem, commonly used speeds 
range from 1,200 bits per second to 9,600 bits per second.) 

Speech ~ 100,000 bits per second 
(One government standard uses 8,000 samples per second, 12 
bits per sample.) 

Stereo Music ~ 1.5 million bits per second 
(A standard compact disc uses 44,100 samples per second, 16 
bits per sample, 2 channels.) 

Picture Phone ~ 12 million bits per second 
(A low resolution black and white product might require 8 bits 
per pixel, 256x256 pixels per frame, 24 frames per second.) 

Black & White Video ~ 60 million bits per second 
(A medium resolution product might use 8 bits per pixel, 512 
by 512 pixels per frame, 30 frames per second.) 

HDTV ~ 1 billion bits per second 
(A proposed standard has 24 bits per pixel, 1024 by 768 pixels 
per frame, 60 frames per second.) 


Image Compression with 
Vector Quantization 

IDEA: Map sub-arrays of pixels ("vectors") to 
the "closest" vector in a dictionary of vectors. 


Data Compression Research 
at Brandeis 


Lossless Compression: 

• Systolic Algorithms 

• High Speed Hardware 

• Optimal Poly-Log Algorithms for Parallel Machines 

• Poly-Log Approximation Algorithms for Simple Architectures 

• Complexity of Off-Line Vs On-Line Encoding 

Image Compression: 

• Tree-Structured Vector Quantization 

• On-Line Adaptive VQ with Variable-Sized Vectors 

• Visualization Tools 

• Fast “Browsing” of Scientific Images 

• Applications to Scene Analysis and Object Classification 

Video Compression: 

• Real-Time Displacement Estimation with Variable-Size Blocks 

• Sub-Linear Algorithms 

• Hardware Design for Integrated Systems 

• Applications to Machine Learning 


Error Resilient Compression 
Generic Encoding Algorithm 


1. Initializations. 

   - Dictionary D = 256 single byte values 

   - Growing Points Pool GPP = one pixel 

2. Repeat until there are no more growing 
   points in GPP: 

   (a) {Growing Heuristic} 
       Choose a growing point GP from GPP. 

   (b) {Match Heuristic} 
       Find the "best" match block b. 
       Transmit $\lceil \log_2 |D| \rceil$ bits for the 
       index of b. 

   (c) {Update Heuristics} 
       Update D and GPP. 
       (Delete if necessary). 
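
As a worked example of the index cost in step 2(b), read here as the ceiling of log2 of the dictionary size, the following sketch computes the number of bits needed to name one dictionary entry. The function name `index_bits` is hypothetical.

```python
def index_bits(dictionary_size):
    """Bits needed to name one of `dictionary_size` entries: ceil(log2(size))."""
    if dictionary_size < 1:
        raise ValueError("dictionary must contain at least one entry")
    # (n - 1).bit_length() equals ceil(log2(n)) for every integer n >= 1.
    return (dictionary_size - 1).bit_length()


bits_initial = index_bits(256)          # the 256 single byte values
bits_after_one_entry = index_bits(257)  # one block added by the update heuristic
```

Under this reading, the initial dictionary costs 8 bits per index, and the cost rises by one bit as soon as the dictionary grows past a power of two.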



Growing Strategies 


Wave: 

Lifo: 

Circular: (Used for all experiments) 


[Figure: partially encoded image with its growing points and the selected growing point, and the same image after adding a new matched block] 


The Match Heuristic 

Decides what block (entry) from the dictionary best matches 
the area of the image originating in the selected GP. 


GREEDY: Chooses the biggest block 
satisfying the threshold. 

* Moderated-GREEDY: (Used for all experiments) 
Best match must be significantly larger 
than rivals of almost equal quality. 


MATCH HEURISTIC cont.: 

Parameters: 

• Distortion measure (e.g. L2) 

• The threshold 

• Overlapping blocks: 

  - FIRST cover 

  - LAST cover 

  - AVERAGE (Used for all experiments) 

[Figure: new matched block overlapping the previously encoded image] 
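
A minimal sketch of the GREEDY variant of the match heuristic follows. It assumes that a block is anchored with its top-left corner at the growing point, that distortion is the mean squared error per pixel, and that images and blocks are plain lists of rows; these choices and the names `l2_distortion` and `greedy_match` are illustrative assumptions, not details given on the slides.

```python
def l2_distortion(image, block, top, left):
    """Mean squared error between `block` and the image area starting at (top, left)."""
    height, width = len(block), len(block[0])
    total = 0
    for r in range(height):
        for c in range(width):
            diff = image[top + r][left + c] - block[r][c]
            total += diff * diff
    return total / (height * width)


def greedy_match(image, dictionary, top, left, threshold):
    """Index of the biggest block that fits at (top, left) within the threshold."""
    best_index, best_area = None, 0
    for index, block in enumerate(dictionary):
        height, width = len(block), len(block[0])
        if top + height > len(image) or left + width > len(image[0]):
            continue  # block would run off the image
        if l2_distortion(image, block, top, left) > threshold:
            continue  # too different from the image area
        if height * width > best_area:
            best_index, best_area = index, height * width
    return best_index


image = [[10, 10, 50],
         [10, 10, 50],
         [90, 90, 90]]
dictionary = [
    [[10]],                        # 1x1 block
    [[10, 10], [10, 10]],          # 2x2 block
    [[10, 10, 10], [10, 10, 10]],  # 2x3 block, poor fit in the third column
]
choice = greedy_match(image, dictionary, 0, 0, threshold=4.0)
```

Here the 2x2 block is chosen: it is the largest entry whose distortion stays under the threshold. The Moderated-GREEDY rule would additionally compare the winner against rivals of almost equal quality before accepting it.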







Dictionary Update Heuristic 

* One-Row plus One-Column (Used for all experiments) 

Added blocks: 

  - Vertical extension 

  - Horizontal extension 

[Figure: matched block and previously encoded image, showing the vertical and horizontal extensions] 


Deletion Heuristic 

FREEZE: Once the dictionary is full, "freeze" 
it (i.e. do not allow any further entries to 
be added). 

LRU: Delete the entry that has been least 
recently used. 

* FIFO: (Used for all experiments) 
Keep the dictionary entries in a queue 
implemented as a circular array. 
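
The FIFO deletion heuristic can be sketched as a fixed-size array with a wrapping write position. The class name `FifoDictionary` is hypothetical, and the sketch does not model how the initial single byte entries are treated, which the slides do not say.

```python
class FifoDictionary:
    """Fixed-capacity dictionary; once full, a new entry overwrites the oldest one."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.next_slot = 0  # where the next entry goes; the oldest entry once full

    def add(self, entry):
        index = self.next_slot
        self.slots[index] = entry
        self.next_slot = (index + 1) % len(self.slots)
        return index  # the index that would be transmitted for this entry


d = FifoDictionary(3)
indices = [d.add(e) for e in ("a", "b", "c", "d")]
```

After the fourth insertion the oldest entry, "a", has been replaced by "d" in slot 0. Deletion costs constant time and needs no usage bookkeeping, unlike LRU.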


Test Images 


BrainCAT: Cat-scan brain image, 512 by 512 pixels, 8 bits per pixel. 

BrainMRSide: Magnetic resonance medical image that shows a side cross- 
section of a head, 256 by 256 pixels, 8 bits per pixel; this is the 
medical image used by Gray, Cosman, and Riskin (1991, 1992). 

BrainMRTop: Magnetic resonance medical image that shows a top cross- 
section of a head, 256 by 256 pixels, 8 bits per pixel. 

NASA5: Band 5 of a 7-band image of Donaldsonville, LA; the least 
compressible of the 7 bands by the UNIX compress command. 

NASA6: Band 6 of a 7-band image of Donaldsonville, LA; the most 
compressible of the 7 bands by the UNIX compress command. 

Woman Hat: The standard woman in the hat photo, 512 by 512 pixels, 8 
bits per pixel. 

LivingRoom: Two people in the living room of an old house with light coming 
in the window, 512 by 512 pixels, 8 bits per pixel. 

FingerPrint: An FBI finger print image, 768 by 768 pixels, 8 bits per pixel; 
includes some text at the top. 

HandWriting: The first two paragraphs and part of the figure of page 165 of 
Image and Text Compression (Kluwer Academic Press, Norwell, MA) 
written by hand on a 10 inch high by 7.5 inch wide 
piece of gray stationery scanned at 128 pixels per inch, 8 bits 
per pixel; approximately 1.2 million bytes. 
                                                                        "Fair" 
[names]       SNR          COMPRESSION     SNR          COMPRESSION     SNR          COMPRESSION 
              FULL (TREE)  FULL / TREE     FULL (TREE)  FULL / TREE     FULL (TREE)  FULL / TREE 

BrainCAT      29 (29)      4.2 / 4.2       22 (22)      8.9 / 8.8       18 (18)      12.5 / 12.6 
BrainMRSide   29 (28)      4.0 / 4.1       27 (26)      4.8 / 4.9       21 (21)      10.4 / 10.4 
BrainMRTop    27 (27)      2.8 / 2.8       21 (21)      5.6 / 5.5       15 (15)      10.8 / 10.5 
NASA5         31 (30)      4.1 / 4.3       28 (28)      5.5 / 5.5       26 (26)      7.4 / 7.5 
NASA6         46 (46)      22.2 / 22.7     41 (41)      ?9.4 / 82.1     39 (40)      97.7 / 105.6 
Woman Hat     32 (32)      6.5 / 6.4       30 (30)      9.2 / 8.9       27 (27)      14.1 / 14.1 
LivingRoom    30 (29)      5.1 / 5.2       27 (27)      7.4 / 7.3       25 (25)      10.5 / 10.5 
FingerPrint   32 (32)      6.2 / 6.2       24 (24)      27.8 / 27.6     22 (22)      38.3 / 37.8 
Handwriting   32 (32)      16.5 / 16.5     25 (25)      59.2 / 60.8     17 (18)      175.0 / 170.7 


Simplified Video Compression System 

[Figure: block diagram; legible labels: "lossy preprocessing of individual frames", "output"] 


Displacement Estimation 


Idea : Approximate interframe motion by 

piecewise translation of blocks of pixels. 


(Rotation, zooming, etc., approximated by 
block translation, if blocks are small.) 
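
The idea of approximating motion by block translation can be illustrated with exhaustive block matching. The slides do not state the search method or the matching cost, so this sketch assumes a full search over a small window and a sum of absolute differences cost; the names `sad` and `estimate_displacement` are hypothetical. The returned vector points from a block in the current frame to its best match in the previous frame.

```python
def sad(prev, cur, top, left, dy, dx, size):
    """Sum of absolute differences between a block of `cur` and a shifted block of `prev`."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(cur[top + r][left + c] - prev[top + r + dy][left + c + dx])
    return total


def estimate_displacement(prev, cur, top, left, size, search):
    """Try every (dy, dx) with |dy|, |dx| <= search and return the cheapest one."""
    rows, cols = len(prev), len(prev[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if not (0 <= top + dy and top + dy + size <= rows):
                continue  # shifted block leaves the frame vertically
            if not (0 <= left + dx and left + dx + size <= cols):
                continue  # shifted block leaves the frame horizontally
            cost = sad(prev, cur, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


# Previous frame: a simple ramp. Current frame: the same content moved up 1 and left 2.
prev = [[10 * r + c for c in range(6)] for r in range(6)]
cur = [[prev[r + 1][c + 2] if r + 1 < 6 and c + 2 < 6 else 0 for c in range(6)]
       for r in range(6)]
vector = estimate_displacement(prev, cur, top=1, left=1, size=2, search=2)
```

For the 2x2 block at row 1, column 1 of the current frame, the search recovers the displacement (1, 2) into the previous frame. The cost of a full search grows with the square of the window size, which is one motivation for faster methods.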


Note: Displacement estimation is a crucial part 
of the MPEG standard. 


The Model of Computation 

[Figure: diagram residue; legible labels: "input", "DISPLACEMENT ESTIMATION ENCODER"] 

Input/Output is serial. 

n : number of pixels per frame. 

Each processor corresponds to a block of k pixels 
(i.e. n/k = N x N). 

Controller communicates with only one processor. 

Data for the current and previous frame is 
processed by the grid while data for the next 
frame is filling up the frame buffer. 
Superblocks 


Def. : Superblock at time t : set of adjacent 
blocks with the same DMD at time t-1 


Properties of Superblocks: 

• Superblocks will represent areas of the image 
with the same displacement vector 

• Superblocks may have no prescribed shape 

• Superblocks may grow and shrink from 
frame to frame 


Idea : Use a parallel grid architecture to segment 
each frame into superblocks 


Note : We will not need the monotonicity assumption 
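
The segmentation into superblocks can be sketched sequentially as a connected-component labeling of the block grid. This assumes that "the same DMD" means an identical displacement vector and that adjacency means sharing an edge; the slides propose a parallel grid architecture, for which this flood fill is only a single-processor stand-in, and the function name `superblocks` is hypothetical.

```python
from collections import deque


def superblocks(vectors):
    """Label edge-connected regions of blocks that share the same displacement vector."""
    rows, cols = len(vectors), len(vectors[0])
    labels = [[None] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] is not None:
                continue  # block already belongs to a superblock
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] is None
                            and vectors[ny][nx] == vectors[y][x]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return labels, count


# One displacement vector per block, for a 3 x 3 grid of blocks.
grid = [[(0, 0), (0, 0), (1, 2)],
        [(0, 0), (1, 2), (1, 2)],
        [(0, 1), (0, 1), (0, 0)]]
labels, count = superblocks(grid)
```

The example yields four superblocks of irregular shape, and the isolated (0, 0) block in the corner is not merged with the (0, 0) region it does not touch, which matches the property that superblocks have no prescribed shape.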



[Figure: (a), (b) two frames of the KIDS sequence, including KIDS3; (c), (d) predictions of KIDS3 for two block sizes] 

[Figure: segmentation of the frame KIDS3 into superblocks] 


[Figure: system diagram; legible labels: Hard Disk 4 Gig, Navigate, Data, Hard Disk 8 Gig, Exabyte 120 Tape Library with 4 8500C Drives] 

• System designed to be totally self sufficient 
• All computers will be hard wired together for speed 
• Dec 5000 only handles browse, orders and ftp transfer 
• Dec Alpha (Navigate) only does navigations (allow spooling) 
• Dec Alpha (Data) only processes tape requests (Reads/Writes) 
• Exabyte 120 with compression drives will hold 1 terabyte of data 


EOSDIS TESTBED FUTURE PLANS 

• EXPAND PRESENT SYSTEM TO INCLUDE GLOBAL 
AVHRR DATA VIA THE CU DOMSAT ANTENNA FOR 
THE YEAR 1994; ON A TRIAL BASIS DUE TO THE 
LARGE VOLUME OF DATA 

• ASSESS THE CAPABILITY OF THE NEW 
INDEPENDENT STORAGE SYSTEM TO HANDLE THE 
GREATER VOLUMES OF DOMSAT AND OTHER 
SATELLITE DATA 

• EXPLORE THE CAPACITY OF THE NEW NETWORK 
CAPABILITIES TO HANDLE THE TRANSFER OF ALL 
STORED AND NEWLY COLLECTED DATA 

• TRANSITION PRESENT SYSTEM TO V0 EOSDIS 
BEFORE DEC 1994; SYSTEM TO BE LOCATED EITHER 
AT JPL OR GSFC (OR BOTH) 

• SHUT DOWN EOSDIS TESTBED DEC. 31, 1994 (OR 
SHORTLY THEREAFTER) 
EOSDIS TESTBED TECHNOLOGY TRANSFER 


• DISPLAY AND DATA MANIPULATION SOFTWARE TRANSFERRED 
TO APPROXIMATELY 2,000 USERS (PLUS JPL DIST. OF MAGIC > 
1,000); USERS INCLUDE EDUCATION, GOVERNMENT AND INDUSTRY 

• NAVIGATION SOFTWARE HAS BEEN CUSTOM HOSTED AT OVER 8-10 
SITES IN THE U.S. (ALSO RUN AT 5 DIFFERENT FOREIGN INSTS.) 

• CU/CCAR WILL BE FUNDED TO WORK WITH NOAA/NESDIS ACTIVE 
SATELLITE ARCHIVE SYSTEM TO UPDATE THEIR DATA ACCESS 
SYSTEM AND SOFTWARE, BASED IN LARGE PART ON THE SUCCESS 
OF THE EOSDIS TESTBED 

• EOSDIS TESTBED SYSTEM TO BE MERGED WITH GRADS TO 
PROVIDE GREATER USER ACCESS TO BOTH SYSTEMS IN AN 
INTEGRATED FASHION 

• EOSDIS TESTBED IS CONSIDERED AS AN ARCHIVE PROTOTYPE 
FOR EOSDIS DEVELOPMENT 

• EOSDIS TESTBED SYSTEM IS TO BE INCORPORATED WITH THE K 
TO 12TH GRADE EDUCATIONAL PROGRAM OF THE ASPEN GLOBAL 
CHANGE INSTITUTE 



DEVELOPMENT OF A TOOL-SET 
FOR SIMULTANEOUS, MULTI-SITE 
OBSERVATIONS OF ASTRONOMICAL 
OBJECTS 


Dr. Supriya Chakrabarti 
Boston University 


August 5, 1993 



Main Goal: 

Make this system available to all 
astronomers... amateur and 
professional 
Present Telescope Control Configuration 

[Figure: Master Controller (PC/AT), Detector Controller (PC/AT), Error Monitor (PC/AT) and Telescope Controller (PC/AT) connected by a LANTASTIC LAN] 

[Figure: network diagram; legible labels: USER; Autowatch; Client Telescope Observation Monitor; Autorun; Telescope Supervisor Program; NETSERVER; Central Dispatcher Program; WORLD] 

[Flowchart; legible boxes: 
  Observer 
  Object/observing parameters: e.g. filters, exposure times, RA, DEC, etc. 
  TELESCOPE CONTROL PROGRAM 
  telescope request 
  TELESCOPE MANAGEMENT PROGRAM: Telescope Database used to determine if a 
    suitable telescope is available &/or if the object requested is observable 
  No suitable telescope available &/or observation not possible 
    (i.e. weather, object position, etc.) 
  Telescope found & observation possible 
  TELESCOPE INTERFACE (starts observation) 
  Results/error messages] 


autorun.c 

1. Links the user to one telescope at a time 

2. Provides a graphical user interface panel which displays: 

   - telescope information 

   - parameter values 

   - scrollbar window where text messages are displayed 


control.c 

1. Allows simultaneous requests to more than one telescope 
   from a single user 

2. Each user/telescope link has a graphical user interface 
   panel for control of each individual telescope by the 
   observer. 


autowatch.c 

1. Allows monitoring of an observation in progress 

2. A user can have several 'autowatch' processes running 
   simultaneously, each monitoring a different telescope 

3. Same graphical user interface panel used in autorun.c 

   - All controls in the autowatch.c panel are disabled 
     except the scrollbar in the window where text messages 
     are displayed. No interaction between the watcher 
     and the telescope is allowed. 











[Figure: sample FITS header from a photometer observation (observatory given as 'AutoScope Observatory'); legible keywords include SIMPLE, BITPIX, NAXIS, EXTEND, OBSERVAT, INSTRUME, TELESCOP, DETECTOR, SITELAT, SITELONG, NDFILTER, CLRINDEX, EXPTIME, COUNTS, JDTIME, FLTNUM and RETCODE; the comments note that the file conforms to the FITS standard and that the records hold photometer results] 

Networking Issues 

• Robustness 

  - No single error should crash the system 

• Fault Tolerance 

  - syntax and system errors 

• Security 

  - Password 
  - User ID 
  - Encryptions (not yet implemented) 

• Data Compression (not yet implemented) 
Software Issues 

• Portability 

  UNIX, C 

• Support 

  Multi-Vendor Platform 
  - Sun UNIX 
  - MS-DOS 

• Public Domain 


Networking - NETSERVER 

• Central Dispatcher for entire network of telescopes 
  and observers 

• User connected to telescope by Modem 

  - PPPRELAY Procedure 

• User linked to telescope via Internet 

  - autorun Procedure 

• User at telescope site 
User Interface 

• UNIX 

• PC 

  Most amateur astronomers use PCs 

• Present User Interface Software 

  Tool Command Language (tcl) in C 
  ToolKit (tk) in C 

• FITS Formatted Data Files 

  FITS is the standard astronomy format 

• X-windows 

• Graphical User Interface 

• Scheduling 

  Priority based 


Users 

• Located at Telescope 

• Remote Observer 

  Connected via Internet or modem 

• Remote "Watcher": viewing permission only 

  (No Interaction allowed) 


Telescope Interface 

• UNIX 

• PC 

• Modes of Operation: 

  'Batch' mode - script written in tcl 
    full or partial night observing routine 
    fully automated 

  Interactive mode 
    1. graphical user interface (or) 
    2. ATIS individual commands 

• ATIS Batch files 


Status 

• Hardware Exists 

• All Networking Tools in Place 

• Some User Interface Written 
Test Results 

• ATIS session via Internet 

  - Batch & Interactive/real-time Observing 

• ATIS session using modem link 

  - Batch & Interactive/real-time Observing 

• Multiple Telescope Session 

• Used tcl scripts for specific observing 
  sequence 

  - Scripts written by User 

  - Sent to telescope by internet and modem 

• Robustness: 

  - During Multi-Telescope Session: 

    1. Disconnected one telescope 

    2. Connected to another telescope 

    3. Used 'autowatch' on one or more sites 

  - Discontinued autowatch during autorun use 

• Fault Tolerance: 

  - Syntax errors resulted in 'graceful' exit 


Future 

• Improvements to User Interface 

- put reference catalogues on line for guid- 
ing/tracking 

- move telescope according to object name 

• Test using real telescope 

• Do real observing 

• Add Other Capabilities: 

  - Imaging 

  - Spectroscopy 

  - Interferometry 

• Add Other Platforms 

• Add Analysis Tools 

• Add Artificial Intelligence Tools for 
  Observation Scheduling? 

GEOGRAPHIC MFORMATUN 8YS T EH ^ PUSI OH 
ANALYSH^N^SOUmC^iOTC SENSING 


Dr. Anthony Freeman 
Jet Propulsion Laboratory 


PROGRESS REPORT 


• Started out to adapt VICAR/IBIS GIS to include radar 
  data sets (especially radar images). 

  - Achieved a working version. 

  - Poor user interface. 

• Developed MacSigma0. 

  - Supervised classification (of radar images). 

  - Simplifying data display, analysis, interpretation. 

• Fit 3-component scattering model to AIRSAR data. 

  - Surface (Bragg) scatter 
  - Double-bounce 
  - Volume 

  (Freeman and Durden, 1992) 


JPL 

SPACEBORNE IMAGING RADAR-C 

SIR-CED 

SCATTERING MECHANISMS 

[Figure: scattering mechanisms; legible panel labels: Smooth Surface, Rough Surface] 
SYNTHETIC APERTURE RADAR IMAGE OF THE FLEVOLAND 
AGRICULTURAL SITE IN THE NETHERLANDS 

[Figure: classified radar image; legend: Water, Forest, Stem, Potatoes, Lucerne, Winter wheat, Peas, Sugar beet, Summer barley, Roads] 

• No vegetation 

  - Surface scatter dominant at all 3 frequencies 
  - Or total backscatter very low. 

• Low Vegetation 

  - Volume scatter dominant at C-band but not at L-band. 

• Medium Vegetation 

  - L-band volume & double-bounce high. 
  - P-band volume scatter low. 

• Forest 

  - L-band and P-band volume scatter high. 

• Urban 

  - Total backscatter high at L-band and P-band. 
  - Double-bounce > volume at L-band and P-band. 
  - Volume scatter not dominant at C-band. 

• "Double-bounce" vegetated areas: 

  - Double-bounce > 1/3 volume at appropriate frequency. 
ENVISION: A SYSTEM FOR MANAGEMENT 
AND DISPLAY OF LARGE DATA SETS 


Kenneth P. Bowman 
Texas A & M University 


August 5, 1993 


Kenneth P. Bowman and Sridhar Pathi 
Department of Meteorology, Texas A&M University 

Keith Searight, John E. Walsh, and Robert B. Wilhelmson 
Department of Atmospheric Sciences, Univ. of Illinois at Urbana-Champaign 



Goals 

Integrate data management, manipulation, analysis, 
and display functions into a single interactive 
environment 

Use standard, portable data storage and 
management tools to provide access to multi-GB 
data sets 

Provide a simple, intuitive, collaborative, portable 
user interface 


Envision Data Manager 

Data server 
Project file 

• Groups one or more data files into a Project 

• 'Virtual' dimensions 

• Data 'files' larger than 2 GB 

• Updating meta-data without copying netCDF 
files 

• Adding meta-data to read-only files 

• All data files remain as netCDF files (currently) 


• Enhance the capabilities of existing interactive 
visualization software developed at the National 
Center for Supercomputing Applications (NCSA) at 
the University of Illinois 


• Provide interfaces to other graphics display systems 


• Distribute functional components 

• Public domain 


Data storage 

• NetCDF (UCAR/UNIDATA) 
  Self-describing 
  Gridded, multi-dimensional arrays 
  Extensible 
  Random access to any hyperslab 
  Machine independent 

• HDF (NCSA) - Hierarchical Data Format 

• EOSDIS 





















INTERACTIVE ANALYSIS 

SCIENCE OBSERVATION VISUALIZATION 

THE EOS/PATHFINDER INTERUSE EXPERIMENT 

data sets 

Issues regarding the interuse of Level 2 and Level 3 data will be considered 

messages. Currently supports unix workstations. Working 
with software vendors to support PCs and Macs. 

Demonstrate high speed (1 Mbps+) wireless connectivity 
to portable computers with special emphasis on support 
for advanced applications like packet video. 






